This notebook is a template with each step that you need to complete for the project.
Please fill in your code wherever you see explicit ? markers in the notebook. You are welcome to add more cells and code as you see fit.
Once you have completed all the code implementations, please export your notebook as an HTML file so the reviewers can view your code. Make sure all cell outputs are rendered correctly.
File-> Export Notebook As... -> Export Notebook as HTML
There is a writeup to complete as well after all code implementation is done. Please answer all questions and attach the necessary tables and charts. You can complete the writeup in either Markdown or PDF.
Completing the code template and writeup template will cover all of the rubric points for this project.
The rubric contains "Stand Out Suggestions" for enhancing the project beyond the minimum requirements. The stand out suggestions are optional. If you decide to pursue the "stand out suggestions", you can include the code in this notebook and also discuss the results in the writeup file.
Below is an example of the steps to get the API username and key. Each student will have their own username and key.
Download the kaggle.json file and use the username and key it contains.
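As a minimal sketch, the credentials from the downloaded kaggle.json can be written to the location the Kaggle CLI expects. The file path `~/.kaggle/kaggle.json` is the CLI's real default; the username and key values below are placeholders you would replace with your own:

```python
import json
import os
from pathlib import Path

# Placeholder credentials -- replace with the values from your own
# downloaded kaggle.json.
credentials = {"username": "your-kaggle-username", "key": "your-api-key"}

# The Kaggle CLI looks for credentials at ~/.kaggle/kaggle.json.
kaggle_dir = Path.home() / ".kaggle"
kaggle_dir.mkdir(exist_ok=True)
config_path = kaggle_dir / "kaggle.json"
config_path.write_text(json.dumps(credentials))

# The CLI warns about world-readable key files, so restrict permissions.
os.chmod(config_path, 0o600)
print(f"Wrote credentials to {config_path}")
```

After this, `kaggle competitions download` and similar commands should authenticate without prompting.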
ml.t3.medium instance (2 vCPU + 4 GiB)
Python 3 (MXNet 1.8 Python 3.7 CPU Optimized)
!pip install -U pip
!pip install -U setuptools wheel
!pip install -U "mxnet<2.0.0" bokeh==2.0.1
!pip install autogluon --no-cache-dir
!pip install kaggle
!pip install pandas-profiling
!pip install ipywidgets
# Without --no-cache-dir, smaller aws instances may have trouble installing
Requirement already satisfied: pip in /usr/local/lib/python3.7/site-packages (21.3.1)
Successfully installed pip-22.0.4
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
Successfully installed setuptools-60.10.0 wheel-0.37.1
Successfully installed bokeh-2.0.1 mxnet-1.9.0
Installing collected packages: torch, xgboost, lightgbm, catboost, fastai, pytorch-lightning, transformers, ray, dask, distributed, gluoncv, autogluon.common, autogluon.features, autogluon.core, autogluon.vision, autogluon.text, autogluon.tabular, autogluon
[remaining pip dependency-resolution and download progress output trimmed]
Attempting uninstall: typing-extensions
Found existing installation: typing_extensions 4.0.1
Uninstalling typing_extensions-4.0.1:
Successfully uninstalled typing_extensions-4.0.1
Attempting uninstall: tqdm
Found existing installation: tqdm 4.39.0
Uninstalling tqdm-4.39.0:
Successfully uninstalled tqdm-4.39.0
Attempting uninstall: setuptools
Found existing installation: setuptools 60.10.0
Uninstalling setuptools-60.10.0:
Successfully uninstalled setuptools-60.10.0
Attempting uninstall: Pillow
Found existing installation: Pillow 8.4.0
Uninstalling Pillow-8.4.0:
Successfully uninstalled Pillow-8.4.0
Attempting uninstall: numpy
Found existing installation: numpy 1.19.1
Uninstalling numpy-1.19.1:
Successfully uninstalled numpy-1.19.1
Attempting uninstall: importlib-metadata
Found existing installation: importlib-metadata 4.8.2
Uninstalling importlib-metadata-4.8.2:
Successfully uninstalled importlib-metadata-4.8.2
Attempting uninstall: scipy
Found existing installation: scipy 1.4.1
Uninstalling scipy-1.4.1:
Successfully uninstalled scipy-1.4.1
Attempting uninstall: gluoncv
Found existing installation: gluoncv 0.8.0
Uninstalling gluoncv-0.8.0:
Successfully uninstalled gluoncv-0.8.0
Successfully installed Pillow-9.0.1 PyWavelets-1.3.0 absl-py-1.0.0 aiohttp-3.8.1 aiosignal-1.2.0 antlr4-python3-runtime-4.8 async-timeout-4.0.2 asynctest-0.13.0 autocfg-0.0.8 autogluon-0.4.0 autogluon-contrib-nlp-0.0.1b20220208 autogluon.common-0.4.0 autogluon.core-0.4.0 autogluon.features-0.4.0 autogluon.tabular-0.4.0 autogluon.text-0.4.0 autogluon.vision-0.4.0 blis-0.7.6 cachetools-5.0.0 catalogue-2.0.6 catboost-1.0.4 charset-normalizer-2.0.12 click-8.0.4 contextvars-2.4 cymem-2.0.6 dask-2021.11.2 deprecated-1.2.13 distributed-2021.11.2 fairscale-0.4.6 fastai-2.5.3 fastcore-1.3.29 fastdownload-0.0.5 fastprogress-1.0.2 filelock-3.6.0 flake8-4.0.1 frozenlist-1.3.0 future-0.18.2 gluoncv-0.10.5 google-auth-2.6.2 google-auth-oauthlib-0.4.6 grpcio-1.44.0 heapdict-1.0.1 huggingface-hub-0.4.0 immutables-0.16 importlib-metadata-4.2.0 importlib-resources-5.4.0 jsonschema-4.4.0 langcodes-3.3.0 lightgbm-3.3.2 locket-0.2.1 markdown-3.3.4 mccabe-0.6.1 msgpack-1.0.3 multidict-6.0.2 murmurhash-1.0.6 nptyping-1.4.4 numpy-1.21.5 oauthlib-3.2.0 omegaconf-2.1.1 partd-1.2.0 pathy-0.6.1 preshed-3.0.6 pyDeprecate-0.3.1 pyasn1-modules-0.2.8 pycodestyle-2.8.0 pydantic-1.8.2 pyflakes-2.4.0 pyrsistent-0.18.1 pytorch-lightning-1.5.10 ray-1.8.0 redis-4.1.4 regex-2022.3.15 requests-oauthlib-1.3.1 sacrebleu-2.0.0 sacremoses-0.0.49 scikit-image-0.19.2 scipy-1.7.3 sentencepiece-0.1.95 setuptools-59.5.0 smart-open-5.2.1 sortedcontainers-2.4.0 spacy-3.2.3 spacy-legacy-3.0.9 spacy-loggers-1.0.1 srsly-2.4.2 tblib-1.7.0 tensorboard-2.8.0 tensorboard-data-server-0.6.1 tensorboard-plugin-wit-1.8.1 thinc-8.0.15 tifffile-2021.11.2 timm-0.5.4 tokenizers-0.11.6 toolz-0.11.2 torch-1.10.2 torchmetrics-0.7.2 torchvision-0.11.3 tqdm-4.63.0 transformers-4.16.2 typer-0.4.0 typing-extensions-3.10.0.2 typish-1.9.3 wasabi-0.9.0 wrapt-1.14.0 xgboost-1.4.2 yacs-0.1.8 yarl-1.7.2 zict-2.1.0
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
Collecting kaggle
Using cached kaggle-1.5.12-py3-none-any.whl
Requirement already satisfied: six>=1.10 in /usr/local/lib/python3.7/site-packages (from kaggle) (1.16.0)
Requirement already satisfied: tqdm in /usr/local/lib/python3.7/site-packages (from kaggle) (4.63.0)
Requirement already satisfied: requests in /usr/local/lib/python3.7/site-packages (from kaggle) (2.22.0)
Requirement already satisfied: python-dateutil in /usr/local/lib/python3.7/site-packages (from kaggle) (2.8.2)
Requirement already satisfied: urllib3 in /usr/local/lib/python3.7/site-packages (from kaggle) (1.25.11)
Collecting python-slugify
Using cached python_slugify-6.1.1-py2.py3-none-any.whl (9.1 kB)
Requirement already satisfied: certifi in /usr/local/lib/python3.7/site-packages (from kaggle) (2021.10.8)
Collecting text-unidecode>=1.3
Using cached text_unidecode-1.3-py2.py3-none-any.whl (78 kB)
Requirement already satisfied: idna<2.9,>=2.5 in /usr/local/lib/python3.7/site-packages (from requests->kaggle) (2.8)
Requirement already satisfied: chardet<3.1.0,>=3.0.2 in /usr/local/lib/python3.7/site-packages (from requests->kaggle) (3.0.4)
Installing collected packages: text-unidecode, python-slugify, kaggle
Successfully installed kaggle-1.5.12 python-slugify-6.1.1 text-unidecode-1.3
Collecting pandas-profiling
Using cached pandas_profiling-3.1.0-py2.py3-none-any.whl (261 kB)
Collecting requests>=2.24.0
Using cached requests-2.27.1-py2.py3-none-any.whl (63 kB)
Requirement already satisfied: markupsafe~=2.0.1 in /usr/local/lib/python3.7/site-packages (from pandas-profiling) (2.0.1)
Requirement already satisfied: pydantic>=1.8.1 in /usr/local/lib/python3.7/site-packages (from pandas-profiling) (1.8.2)
Requirement already satisfied: numpy>=1.16.0 in /usr/local/lib/python3.7/site-packages (from pandas-profiling) (1.21.5)
Requirement already satisfied: PyYAML>=5.0.0 in /usr/local/lib/python3.7/site-packages (from pandas-profiling) (5.4.1)
Requirement already satisfied: jinja2>=2.11.1 in /usr/local/lib/python3.7/site-packages (from pandas-profiling) (3.0.3)
Collecting htmlmin>=0.1.12
Using cached htmlmin-0.1.12-py3-none-any.whl
Collecting missingno>=0.4.2
Using cached missingno-0.5.1-py3-none-any.whl (8.7 kB)
Collecting multimethod>=1.4
Using cached multimethod-1.7-py3-none-any.whl (9.5 kB)
Collecting tangled-up-in-unicode==0.1.0
Using cached tangled_up_in_unicode-0.1.0-py3-none-any.whl (3.1 MB)
Collecting phik>=0.11.1
Using cached phik-0.12.1-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (687 kB)
Requirement already satisfied: tqdm>=4.48.2 in /usr/local/lib/python3.7/site-packages (from pandas-profiling) (4.63.0)
Requirement already satisfied: scipy>=1.4.1 in /usr/local/lib/python3.7/site-packages (from pandas-profiling) (1.7.3)
Requirement already satisfied: matplotlib>=3.2.0 in /usr/local/lib/python3.7/site-packages (from pandas-profiling) (3.5.0)
Collecting visions[type_image_path]==0.7.4
Using cached visions-0.7.4-py3-none-any.whl (102 kB)
Collecting joblib~=1.0.1
Using cached joblib-1.0.1-py3-none-any.whl (303 kB)
Requirement already satisfied: pandas!=1.0.0,!=1.0.1,!=1.0.2,!=1.1.0,>=0.25.3 in /usr/local/lib/python3.7/site-packages (from pandas-profiling) (1.3.4)
Requirement already satisfied: seaborn>=0.10.1 in /usr/local/lib/python3.7/site-packages (from pandas-profiling) (0.11.2)
Requirement already satisfied: attrs>=19.3.0 in /usr/local/lib/python3.7/site-packages (from visions[type_image_path]==0.7.4->pandas-profiling) (21.2.0)
Requirement already satisfied: networkx>=2.4 in /usr/local/lib/python3.7/site-packages (from visions[type_image_path]==0.7.4->pandas-profiling) (2.6.3)
Collecting imagehash
Using cached ImageHash-4.2.1-py2.py3-none-any.whl
Requirement already satisfied: Pillow in /usr/local/lib/python3.7/site-packages (from visions[type_image_path]==0.7.4->pandas-profiling) (9.0.1)
Requirement already satisfied: setuptools-scm>=4 in /usr/local/lib/python3.7/site-packages (from matplotlib>=3.2.0->pandas-profiling) (6.3.2)
Requirement already satisfied: packaging>=20.0 in /usr/local/lib/python3.7/site-packages (from matplotlib>=3.2.0->pandas-profiling) (21.3)
Requirement already satisfied: fonttools>=4.22.0 in /usr/local/lib/python3.7/site-packages (from matplotlib>=3.2.0->pandas-profiling) (4.28.2)
Requirement already satisfied: pyparsing>=2.2.1 in /usr/local/lib/python3.7/site-packages (from matplotlib>=3.2.0->pandas-profiling) (3.0.6)
Requirement already satisfied: kiwisolver>=1.0.1 in /usr/local/lib/python3.7/site-packages (from matplotlib>=3.2.0->pandas-profiling) (1.3.2)
Requirement already satisfied: python-dateutil>=2.7 in /usr/local/lib/python3.7/site-packages (from matplotlib>=3.2.0->pandas-profiling) (2.8.2)
Requirement already satisfied: cycler>=0.10 in /usr/local/lib/python3.7/site-packages (from matplotlib>=3.2.0->pandas-profiling) (0.11.0)
Requirement already satisfied: pytz>=2017.3 in /usr/local/lib/python3.7/site-packages (from pandas!=1.0.0,!=1.0.1,!=1.0.2,!=1.1.0,>=0.25.3->pandas-profiling) (2021.3)
Requirement already satisfied: typing-extensions>=3.7.4.3 in /usr/local/lib/python3.7/site-packages (from pydantic>=1.8.1->pandas-profiling) (3.10.0.2)
Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.7/site-packages (from requests>=2.24.0->pandas-profiling) (2021.10.8)
Requirement already satisfied: urllib3<1.27,>=1.21.1 in /usr/local/lib/python3.7/site-packages (from requests>=2.24.0->pandas-profiling) (1.25.11)
Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.7/site-packages (from requests>=2.24.0->pandas-profiling) (2.8)
Requirement already satisfied: charset-normalizer~=2.0.0 in /usr/local/lib/python3.7/site-packages (from requests>=2.24.0->pandas-profiling) (2.0.12)
Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.7/site-packages (from python-dateutil>=2.7->matplotlib>=3.2.0->pandas-profiling) (1.16.0)
Requirement already satisfied: setuptools in /usr/local/lib/python3.7/site-packages (from setuptools-scm>=4->matplotlib>=3.2.0->pandas-profiling) (59.5.0)
Requirement already satisfied: tomli>=1.0.0 in /usr/local/lib/python3.7/site-packages (from setuptools-scm>=4->matplotlib>=3.2.0->pandas-profiling) (1.2.2)
Requirement already satisfied: PyWavelets in /usr/local/lib/python3.7/site-packages (from imagehash->visions[type_image_path]==0.7.4->pandas-profiling) (1.3.0)
Installing collected packages: htmlmin, tangled-up-in-unicode, requests, multimethod, joblib, imagehash, visions, phik, missingno, pandas-profiling
Attempting uninstall: requests
Found existing installation: requests 2.22.0
Uninstalling requests-2.22.0:
Successfully uninstalled requests-2.22.0
Attempting uninstall: joblib
Found existing installation: joblib 1.1.0
Uninstalling joblib-1.1.0:
Successfully uninstalled joblib-1.1.0
Successfully installed htmlmin-0.1.12 imagehash-4.2.1 joblib-1.0.1 missingno-0.5.1 multimethod-1.7 pandas-profiling-3.1.0 phik-0.12.1 requests-2.27.1 tangled-up-in-unicode-0.1.0 visions-0.7.4
Collecting ipywidgets
Using cached ipywidgets-7.7.0-py2.py3-none-any.whl (123 kB)
Requirement already satisfied: ipython>=4.0.0 in /usr/local/lib/python3.7/site-packages (from ipywidgets) (7.16.3)
Requirement already satisfied: ipykernel>=4.5.1 in /usr/local/lib/python3.7/site-packages (from ipywidgets) (5.5.6)
Collecting widgetsnbextension~=3.6.0
Using cached widgetsnbextension-3.6.0-py2.py3-none-any.whl (1.6 MB)
Requirement already satisfied: ipython-genutils~=0.2.0 in /usr/local/lib/python3.7/site-packages (from ipywidgets) (0.2.0)
Requirement already satisfied: traitlets>=4.3.1 in /usr/local/lib/python3.7/site-packages (from ipywidgets) (4.3.3)
Collecting nbformat>=4.2.0
Using cached nbformat-5.2.0-py3-none-any.whl (74 kB)
Collecting jupyterlab-widgets>=1.0.0
Using cached jupyterlab_widgets-1.1.0-py3-none-any.whl (245 kB)
Requirement already satisfied: jupyter-client in /usr/local/lib/python3.7/site-packages (from ipykernel>=4.5.1->ipywidgets) (7.1.2)
Requirement already satisfied: tornado>=4.2 in /usr/local/lib/python3.7/site-packages (from ipykernel>=4.5.1->ipywidgets) (6.1)
Requirement already satisfied: decorator in /usr/local/lib/python3.7/site-packages (from ipython>=4.0.0->ipywidgets) (5.1.1)
Requirement already satisfied: pygments in /usr/local/lib/python3.7/site-packages (from ipython>=4.0.0->ipywidgets) (2.11.2)
Requirement already satisfied: backcall in /usr/local/lib/python3.7/site-packages (from ipython>=4.0.0->ipywidgets) (0.2.0)
Requirement already satisfied: jedi<=0.17.2,>=0.10 in /usr/local/lib/python3.7/site-packages (from ipython>=4.0.0->ipywidgets) (0.17.2)
Requirement already satisfied: setuptools>=18.5 in /usr/local/lib/python3.7/site-packages (from ipython>=4.0.0->ipywidgets) (59.5.0)
Requirement already satisfied: pexpect in /usr/local/lib/python3.7/site-packages (from ipython>=4.0.0->ipywidgets) (4.8.0)
Requirement already satisfied: prompt-toolkit!=3.0.0,!=3.0.1,<3.1.0,>=2.0.0 in /usr/local/lib/python3.7/site-packages (from ipython>=4.0.0->ipywidgets) (3.0.28)
Requirement already satisfied: pickleshare in /usr/local/lib/python3.7/site-packages (from ipython>=4.0.0->ipywidgets) (0.7.5)
Requirement already satisfied: jupyter-core in /usr/local/lib/python3.7/site-packages (from nbformat>=4.2.0->ipywidgets) (4.9.2)
Requirement already satisfied: jsonschema!=2.5.0,>=2.4 in /usr/local/lib/python3.7/site-packages (from nbformat>=4.2.0->ipywidgets) (4.4.0)
Requirement already satisfied: six in /usr/local/lib/python3.7/site-packages (from traitlets>=4.3.1->ipywidgets) (1.16.0)
Collecting notebook>=4.4.1
Using cached notebook-6.4.10-py3-none-any.whl (9.9 MB)
Requirement already satisfied: parso<0.8.0,>=0.7.0 in /usr/local/lib/python3.7/site-packages (from jedi<=0.17.2,>=0.10->ipython>=4.0.0->ipywidgets) (0.7.1)
Requirement already satisfied: importlib-resources>=1.4.0 in /usr/local/lib/python3.7/site-packages (from jsonschema!=2.5.0,>=2.4->nbformat>=4.2.0->ipywidgets) (5.4.0)
Requirement already satisfied: importlib-metadata in /usr/local/lib/python3.7/site-packages (from jsonschema!=2.5.0,>=2.4->nbformat>=4.2.0->ipywidgets) (4.2.0)
Requirement already satisfied: attrs>=17.4.0 in /usr/local/lib/python3.7/site-packages (from jsonschema!=2.5.0,>=2.4->nbformat>=4.2.0->ipywidgets) (21.2.0)
Requirement already satisfied: typing-extensions in /usr/local/lib/python3.7/site-packages (from jsonschema!=2.5.0,>=2.4->nbformat>=4.2.0->ipywidgets) (3.10.0.2)
Requirement already satisfied: pyrsistent!=0.17.0,!=0.17.1,!=0.17.2,>=0.14.0 in /usr/local/lib/python3.7/site-packages (from jsonschema!=2.5.0,>=2.4->nbformat>=4.2.0->ipywidgets) (0.18.1)
Collecting Send2Trash>=1.8.0
Using cached Send2Trash-1.8.0-py3-none-any.whl (18 kB)
Collecting nbconvert>=5
Using cached nbconvert-6.4.4-py3-none-any.whl (561 kB)
Requirement already satisfied: pyzmq>=17 in /usr/local/lib/python3.7/site-packages (from notebook>=4.4.1->widgetsnbextension~=3.6.0->ipywidgets) (22.3.0)
Requirement already satisfied: jinja2 in /usr/local/lib/python3.7/site-packages (from notebook>=4.4.1->widgetsnbextension~=3.6.0->ipywidgets) (3.0.3)
Collecting argon2-cffi
Using cached argon2_cffi-21.3.0-py3-none-any.whl (14 kB)
Requirement already satisfied: nest-asyncio>=1.5 in /usr/local/lib/python3.7/site-packages (from notebook>=4.4.1->widgetsnbextension~=3.6.0->ipywidgets) (1.5.4)
Collecting prometheus-client
Using cached prometheus_client-0.13.1-py3-none-any.whl (57 kB)
Collecting terminado>=0.8.3
Using cached terminado-0.13.3-py3-none-any.whl (14 kB)
Requirement already satisfied: python-dateutil>=2.1 in /usr/local/lib/python3.7/site-packages (from jupyter-client->ipykernel>=4.5.1->ipywidgets) (2.8.2)
Requirement already satisfied: entrypoints in /usr/local/lib/python3.7/site-packages (from jupyter-client->ipykernel>=4.5.1->ipywidgets) (0.4)
Requirement already satisfied: wcwidth in /usr/local/lib/python3.7/site-packages (from prompt-toolkit!=3.0.0,!=3.0.1,<3.1.0,>=2.0.0->ipython>=4.0.0->ipywidgets) (0.2.5)
Requirement already satisfied: ptyprocess>=0.5 in /usr/local/lib/python3.7/site-packages (from pexpect->ipython>=4.0.0->ipywidgets) (0.7.0)
Requirement already satisfied: zipp>=3.1.0 in /usr/local/lib/python3.7/site-packages (from importlib-resources>=1.4.0->jsonschema!=2.5.0,>=2.4->nbformat>=4.2.0->ipywidgets) (3.6.0)
Collecting nbclient<0.6.0,>=0.5.0
Using cached nbclient-0.5.13-py3-none-any.whl (70 kB)
Collecting jupyterlab-pygments
Using cached jupyterlab_pygments-0.1.2-py2.py3-none-any.whl (4.6 kB)
Collecting defusedxml
Using cached defusedxml-0.7.1-py2.py3-none-any.whl (25 kB)
Collecting beautifulsoup4
Using cached beautifulsoup4-4.10.0-py3-none-any.whl (97 kB)
Collecting traitlets>=4.3.1
Using cached traitlets-5.1.1-py3-none-any.whl (102 kB)
Collecting bleach
Using cached bleach-4.1.0-py2.py3-none-any.whl (157 kB)
Collecting testpath
Using cached testpath-0.6.0-py3-none-any.whl (83 kB)
Collecting mistune<2,>=0.8.1
Using cached mistune-0.8.4-py2.py3-none-any.whl (16 kB)
Collecting pandocfilters>=1.4.1
Using cached pandocfilters-1.5.0-py2.py3-none-any.whl (8.7 kB)
Requirement already satisfied: MarkupSafe>=2.0 in /usr/local/lib/python3.7/site-packages (from jinja2->notebook>=4.4.1->widgetsnbextension~=3.6.0->ipywidgets) (2.0.1)
Collecting argon2-cffi-bindings
Using cached argon2_cffi_bindings-21.2.0-cp36-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (86 kB)
Requirement already satisfied: cffi>=1.0.1 in /usr/local/lib/python3.7/site-packages (from argon2-cffi-bindings->argon2-cffi->notebook>=4.4.1->widgetsnbextension~=3.6.0->ipywidgets) (1.15.0)
Collecting soupsieve>1.2
Using cached soupsieve-2.3.1-py3-none-any.whl (37 kB)
Collecting webencodings
Using cached webencodings-0.5.1-py2.py3-none-any.whl (11 kB)
Requirement already satisfied: packaging in /usr/local/lib/python3.7/site-packages (from bleach->nbconvert>=5->notebook>=4.4.1->widgetsnbextension~=3.6.0->ipywidgets) (21.3)
Requirement already satisfied: pycparser in /usr/local/lib/python3.7/site-packages (from cffi>=1.0.1->argon2-cffi-bindings->argon2-cffi->notebook>=4.4.1->widgetsnbextension~=3.6.0->ipywidgets) (2.21)
Requirement already satisfied: pyparsing!=3.0.5,>=2.0.2 in /usr/local/lib/python3.7/site-packages (from packaging->bleach->nbconvert>=5->notebook>=4.4.1->widgetsnbextension~=3.6.0->ipywidgets) (3.0.6)
Installing collected packages: webencodings, Send2Trash, mistune, traitlets, testpath, terminado, soupsieve, prometheus-client, pandocfilters, jupyterlab-widgets, jupyterlab-pygments, defusedxml, bleach, beautifulsoup4, argon2-cffi-bindings, nbformat, argon2-cffi, nbclient, nbconvert, notebook, widgetsnbextension, ipywidgets
Attempting uninstall: traitlets
Found existing installation: traitlets 4.3.3
Uninstalling traitlets-4.3.3:
Successfully uninstalled traitlets-4.3.3
Successfully installed Send2Trash-1.8.0 argon2-cffi-21.3.0 argon2-cffi-bindings-21.2.0 beautifulsoup4-4.10.0 bleach-4.1.0 defusedxml-0.7.1 ipywidgets-7.7.0 jupyterlab-pygments-0.1.2 jupyterlab-widgets-1.1.0 mistune-0.8.4 nbclient-0.5.13 nbconvert-6.4.4 nbformat-5.2.0 notebook-6.4.10 pandocfilters-1.5.0 prometheus-client-0.13.1 soupsieve-2.3.1 terminado-0.13.3 testpath-0.6.0 traitlets-5.1.1 webencodings-0.5.1 widgetsnbextension-3.6.0
# create the .kaggle directory and an empty kaggle.json file
!mkdir -p /root/.kaggle
!touch /root/.kaggle/kaggle.json
!chmod 600 /root/.kaggle/kaggle.json
The code below will add your Kaggle username and token to the kaggle.json file so that Kaggle can be accessed from this notebook. Note: after running this I've removed my Kaggle key to prevent others from using it, and I've commented out the code block so I don't accidentally overwrite the file.
# Fill in your username and key from creating the Kaggle account and API token file
# import json
# kaggle_username = "robsmith155"
# kaggle_key = "KAGGLE_KEY"
# # Save the API token to the kaggle.json file
# with open("/root/.kaggle/kaggle.json", "w") as f:
#     f.write(json.dumps({"username": kaggle_username, "key": kaggle_key}))
# Download the dataset, it will be in a .zip file so you'll need to unzip it as well.
!kaggle competitions download -c bike-sharing-demand
# If you already downloaded it you can use the -o command to overwrite the file
!unzip -o bike-sharing-demand.zip
Downloading bike-sharing-demand.zip to /root/udacity-nd-aws-ml-engineer-project1
  0%|                                        | 0.00/189k [00:00<?, ?B/s]
100%|████████████████████████████████████████| 189k/189k [00:00<00:00, 5.38MB/s]
Archive:  bike-sharing-demand.zip
  inflating: sampleSubmission.csv
  inflating: test.csv
  inflating: train.csv
import pandas as pd
from pandas_profiling import ProfileReport
from autogluon.tabular import TabularPredictor
# Create the train dataset in pandas by reading the csv
# Set the parsing of the datetime column so you can use some of the `dt` features in pandas later
train = pd.read_csv('train.csv', parse_dates=['datetime'])
train.head()
|   | datetime | season | holiday | workingday | weather | temp | atemp | humidity | windspeed | casual | registered | count |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2011-01-01 00:00:00 | 1 | 0 | 0 | 1 | 9.84 | 14.395 | 81 | 0.0 | 3 | 13 | 16 |
| 1 | 2011-01-01 01:00:00 | 1 | 0 | 0 | 1 | 9.02 | 13.635 | 80 | 0.0 | 8 | 32 | 40 |
| 2 | 2011-01-01 02:00:00 | 1 | 0 | 0 | 1 | 9.02 | 13.635 | 80 | 0.0 | 5 | 27 | 32 |
| 3 | 2011-01-01 03:00:00 | 1 | 0 | 0 | 1 | 9.84 | 14.395 | 75 | 0.0 | 3 | 10 | 13 |
| 4 | 2011-01-01 04:00:00 | 1 | 0 | 0 | 1 | 9.84 | 14.395 | 75 | 0.0 | 0 | 1 | 1 |
We can see that the data contains the following columns:
train.columns
Index(['datetime', 'season', 'holiday', 'workingday', 'weather', 'temp',
'atemp', 'humidity', 'windspeed', 'casual', 'registered', 'count'],
dtype='object')
From the Kaggle competition page, these correspond to:
- `datetime` - hourly date + timestamp
- `season` - 1 = spring, 2 = summer, 3 = fall, 4 = winter
- `holiday` - whether the day is considered a holiday
- `workingday` - whether the day is neither a weekend nor a holiday
- `weather` -
  - 1: Clear, Few clouds, Partly cloudy
  - 2: Mist + Cloudy, Mist + Broken clouds, Mist + Few clouds, Mist
  - 3: Light Snow, Light Rain + Thunderstorm + Scattered clouds, Light Rain + Scattered clouds
  - 4: Heavy Rain + Ice Pellets + Thunderstorm + Mist, Snow + Fog
- `temp` - temperature in Celsius
- `atemp` - "feels like" temperature in Celsius
- `humidity` - relative humidity
- `windspeed` - wind speed
- `casual` - number of non-registered user rentals initiated
- `registered` - number of registered user rentals initiated
- `count` - number of total rentals
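Because the `datetime` column was parsed on load, time-based features can be derived from it with the pandas `dt` accessor. A minimal sketch on a small stand-in frame (the `hour` and `dayofweek` column names are my own choices, not part of the original data):

```python
import pandas as pd

# Tiny frame standing in for the parsed train data
df = pd.DataFrame({"datetime": pd.to_datetime(["2011-01-01 00:00:00",
                                               "2011-01-01 13:00:00"])})

# Derive hour-of-day and day-of-week; rental demand is strongly
# dependent on the hour, so these make useful model features
df["hour"] = df["datetime"].dt.hour
df["dayofweek"] = df["datetime"].dt.dayofweek  # Monday=0 ... Sunday=6
```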
We can see more details about the data using the `info()` and `describe()` methods:
train.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10886 entries, 0 to 10885
Data columns (total 12 columns):
 #   Column      Non-Null Count  Dtype
---  ------      --------------  -----
 0   datetime    10886 non-null  datetime64[ns]
 1   season      10886 non-null  int64
 2   holiday     10886 non-null  int64
 3   workingday  10886 non-null  int64
 4   weather     10886 non-null  int64
 5   temp        10886 non-null  float64
 6   atemp       10886 non-null  float64
 7   humidity    10886 non-null  int64
 8   windspeed   10886 non-null  float64
 9   casual      10886 non-null  int64
 10  registered  10886 non-null  int64
 11  count       10886 non-null  int64
dtypes: datetime64[ns](1), float64(3), int64(8)
memory usage: 1020.7 KB
# Simple output of the train dataset to view some of the min/max/variation of the dataset features.
train.describe()
|   | season | holiday | workingday | weather | temp | atemp | humidity | windspeed | casual | registered | count |
|---|---|---|---|---|---|---|---|---|---|---|---|
| count | 10886.000000 | 10886.000000 | 10886.000000 | 10886.000000 | 10886.00000 | 10886.000000 | 10886.000000 | 10886.000000 | 10886.000000 | 10886.000000 | 10886.000000 |
| mean | 2.506614 | 0.028569 | 0.680875 | 1.418427 | 20.23086 | 23.655084 | 61.886460 | 12.799395 | 36.021955 | 155.552177 | 191.574132 |
| std | 1.116174 | 0.166599 | 0.466159 | 0.633839 | 7.79159 | 8.474601 | 19.245033 | 8.164537 | 49.960477 | 151.039033 | 181.144454 |
| min | 1.000000 | 0.000000 | 0.000000 | 1.000000 | 0.82000 | 0.760000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 1.000000 |
| 25% | 2.000000 | 0.000000 | 0.000000 | 1.000000 | 13.94000 | 16.665000 | 47.000000 | 7.001500 | 4.000000 | 36.000000 | 42.000000 |
| 50% | 3.000000 | 0.000000 | 1.000000 | 1.000000 | 20.50000 | 24.240000 | 62.000000 | 12.998000 | 17.000000 | 118.000000 | 145.000000 |
| 75% | 4.000000 | 0.000000 | 1.000000 | 2.000000 | 26.24000 | 31.060000 | 77.000000 | 16.997900 | 49.000000 | 222.000000 | 284.000000 |
| max | 4.000000 | 1.000000 | 1.000000 | 4.000000 | 41.00000 | 45.455000 | 100.000000 | 56.996900 | 367.000000 | 886.000000 | 977.000000 |
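One relationship worth sanity-checking in these columns: `count` should equal `casual` plus `registered`. A sketch of the check against a few stand-in rows mirroring the train schema:

```python
import pandas as pd

# Stand-in rows mirroring the train schema (values taken from head() above)
df = pd.DataFrame({"casual": [3, 8, 5],
                   "registered": [13, 32, 27],
                   "count": [16, 40, 32]})

# If this is ever False, the assumed label definition is wrong
consistent = (df["casual"] + df["registered"] == df["count"]).all()
```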
# Create the test pandas dataframe in pandas by reading the csv, remember to parse the datetime!
test = pd.read_csv('test.csv', parse_dates=['datetime'])
test.head()
|   | datetime | season | holiday | workingday | weather | temp | atemp | humidity | windspeed |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 2011-01-20 00:00:00 | 1 | 0 | 1 | 1 | 10.66 | 11.365 | 56 | 26.0027 |
| 1 | 2011-01-20 01:00:00 | 1 | 0 | 1 | 1 | 10.66 | 13.635 | 56 | 0.0000 |
| 2 | 2011-01-20 02:00:00 | 1 | 0 | 1 | 1 | 10.66 | 13.635 | 56 | 0.0000 |
| 3 | 2011-01-20 03:00:00 | 1 | 0 | 1 | 1 | 10.66 | 12.880 | 56 | 11.0014 |
| 4 | 2011-01-20 04:00:00 | 1 | 0 | 1 | 1 | 10.66 | 12.880 | 56 | 11.0014 |
# Same as the train and test datasets: read the sample submission, parsing the datetime column
submission = pd.read_csv('sampleSubmission.csv', parse_dates=['datetime'])
submission.head()
|   | datetime | count |
|---|---|---|
| 0 | 2011-01-20 00:00:00 | 0 |
| 1 | 2011-01-20 01:00:00 | 0 |
| 2 | 2011-01-20 02:00:00 | 0 |
| 3 | 2011-01-20 03:00:00 | 0 |
| 4 | 2011-01-20 04:00:00 | 0 |
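Once a model is trained, its predictions can be written into this frame before submitting. A hedged sketch: `preds` stands in for the eventual `predictor.predict(test)` output, and negative values are clipped to zero, since negative rental counts are not valid predictions for this competition:

```python
import pandas as pd

# Stand-in for predictor.predict(test); note the one negative prediction
preds = pd.Series([12.3, -4.0, 88.9])

submission = pd.DataFrame({
    "datetime": pd.to_datetime(["2011-01-20 00:00:00",
                                "2011-01-20 01:00:00",
                                "2011-01-20 02:00:00"]),
    "count": 0,
})

# Regressors can emit negative values; clip them to zero before saving
submission["count"] = preds.clip(lower=0)
# submission.to_csv("submission.csv", index=False)
```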
Requirements:

- We are predicting `count`, so it is the label we are setting.
- Ignore the `casual` and `registered` columns as they are also not present in the test dataset.
- Use `root_mean_squared_error` as the metric to use for evaluation.
- Use the `best_quality` preset to focus on creating the best model.

predictor = TabularPredictor(
    label='count',
    eval_metric='root_mean_squared_error'
).fit(
    train,
    ignored_columns=['registered', 'casual'],
    time_limit=600,
    presets='best_quality'
)
No path specified. Models will be saved in: "AutogluonModels/ag-20220318_081916/"
Presets specified: ['best_quality']
Beginning AutoGluon training ... Time limit = 600s
AutoGluon will save models to "AutogluonModels/ag-20220318_081916/"
AutoGluon Version: 0.4.0
Python Version: 3.7.10
Operating System: Linux
Train Data Rows: 10886
Train Data Columns: 9
Label Column: count
Preprocessing data ...
AutoGluon infers your prediction problem is: 'regression' (because dtype of label-column == int and many unique label-values observed).
Label info (max, min, mean, stddev): (977, 1, 191.57413, 181.14445)
If 'regression' is not the correct problem_type, please manually specify the problem_type parameter during predictor init (You may specify problem_type as one of: ['binary', 'multiclass', 'regression'])
Using Feature Generators to preprocess the data ...
Fitting AutoMLPipelineFeatureGenerator...
Available Memory: 2890.02 MB
Train Data (Original) Memory Usage: 0.78 MB (0.0% of available memory)
Inferring data type of each feature based on column values. Set feature_metadata_in to manually specify special dtypes of the features.
Stage 1 Generators:
Fitting AsTypeFeatureGenerator...
Note: Converting 2 features to boolean dtype as they only contain 2 unique values.
Stage 2 Generators:
Fitting FillNaFeatureGenerator...
Stage 3 Generators:
Fitting IdentityFeatureGenerator...
Fitting DatetimeFeatureGenerator...
Stage 4 Generators:
Fitting DropUniqueFeatureGenerator...
Types of features in original data (raw dtype, special dtypes):
('datetime', []) : 1 | ['datetime']
('float', []) : 3 | ['temp', 'atemp', 'windspeed']
('int', []) : 5 | ['season', 'holiday', 'workingday', 'weather', 'humidity']
Types of features in processed data (raw dtype, special dtypes):
('float', []) : 3 | ['temp', 'atemp', 'windspeed']
('int', []) : 3 | ['season', 'weather', 'humidity']
('int', ['bool']) : 2 | ['holiday', 'workingday']
('int', ['datetime_as_int']) : 5 | ['datetime', 'datetime.year', 'datetime.month', 'datetime.day', 'datetime.dayofweek']
0.3s = Fit runtime
9 features in original data used to generate 13 features in processed data.
Train Data (Processed) Memory Usage: 0.98 MB (0.0% of available memory)
Data preprocessing and feature engineering runtime = 0.32s ...
AutoGluon will gauge predictive performance using evaluation metric: 'root_mean_squared_error'
To change this, specify the eval_metric parameter of Predictor()
AutoGluon will fit 2 stack levels (L1 to L2) ...
Fitting 11 L1 models ...
Fitting model: KNeighborsUnif_BAG_L1 ... Training model for up to 399.69s of the 599.68s of remaining time.
-101.5462 = Validation score (root_mean_squared_error)
0.03s = Training runtime
0.1s = Validation runtime
Fitting model: KNeighborsDist_BAG_L1 ... Training model for up to 399.29s of the 599.28s of remaining time.
-84.1251 = Validation score (root_mean_squared_error)
0.03s = Training runtime
0.11s = Validation runtime
Fitting model: LightGBMXT_BAG_L1 ... Training model for up to 398.9s of the 598.89s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
2022-03-18 08:19:22,151 WARNING services.py:1758 -- WARNING: The object store is using /tmp instead of /dev/shm because /dev/shm has only 67108864 bytes available. This will harm performance! You may be able to free up space by deleting files in /dev/shm. If you are inside a Docker container, you can increase /dev/shm size by passing '--shm-size=0.92gb' to 'docker run' (or add it to the run_options list in a Ray cluster config). Make sure to set this to more than 30% of available RAM.
-131.4609 = Validation score (root_mean_squared_error)
64.7s = Training runtime
6.39s = Validation runtime
Fitting model: LightGBM_BAG_L1 ... Training model for up to 323.66s of the 523.65s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
-131.0542 = Validation score (root_mean_squared_error)
27.82s = Training runtime
1.41s = Validation runtime
Fitting model: RandomForestMSE_BAG_L1 ... Training model for up to 292.48s of the 492.47s of remaining time.
-116.6217 = Validation score (root_mean_squared_error)
9.53s = Training runtime
0.44s = Validation runtime
Fitting model: CatBoost_BAG_L1 ... Training model for up to 279.67s of the 479.66s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
-130.5332 = Validation score (root_mean_squared_error)
205.27s = Training runtime
0.15s = Validation runtime
Fitting model: ExtraTreesMSE_BAG_L1 ... Training model for up to 70.66s of the 270.65s of remaining time.
-124.6372 = Validation score (root_mean_squared_error)
4.21s = Training runtime
0.44s = Validation runtime
Fitting model: NeuralNetFastAI_BAG_L1 ... Training model for up to 63.23s of the 263.22s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
-137.3793 = Validation score (root_mean_squared_error)
70.74s = Training runtime
0.44s = Validation runtime
Completed 1/20 k-fold bagging repeats ...
Fitting model: WeightedEnsemble_L2 ... Training model for up to 360.0s of the 189.57s of remaining time.
-84.1251 = Validation score (root_mean_squared_error)
0.75s = Training runtime
0.0s = Validation runtime
Fitting 9 L2 models ...
Fitting model: LightGBMXT_BAG_L2 ... Training model for up to 188.74s of the 188.72s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
-60.4946 = Validation score (root_mean_squared_error)
49.17s = Training runtime
2.91s = Validation runtime
Fitting model: LightGBM_BAG_L2 ... Training model for up to 136.29s of the 136.26s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
-55.0949 = Validation score (root_mean_squared_error)
21.74s = Training runtime
0.27s = Validation runtime
Fitting model: RandomForestMSE_BAG_L2 ... Training model for up to 111.13s of the 111.11s of remaining time.
-53.4252 = Validation score (root_mean_squared_error)
25.61s = Training runtime
0.5s = Validation runtime
Fitting model: CatBoost_BAG_L2 ... Training model for up to 82.35s of the 82.33s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
-55.5012 = Validation score (root_mean_squared_error)
72.16s = Training runtime
0.07s = Validation runtime
Fitting model: ExtraTreesMSE_BAG_L2 ... Training model for up to 7.6s of the 7.59s of remaining time.
-53.7698 = Validation score (root_mean_squared_error)
7.2s = Training runtime
0.5s = Validation runtime
Completed 1/20 k-fold bagging repeats ...
Fitting model: WeightedEnsemble_L3 ... Training model for up to 360.0s of the -2.9s of remaining time.
-52.8146 = Validation score (root_mean_squared_error)
0.47s = Training runtime
0.0s = Validation runtime
AutoGluon training complete, total runtime = 603.61s ... Best model: "WeightedEnsemble_L3"
TabularPredictor saved. To load, use: predictor = TabularPredictor.load("AutogluonModels/ag-20220318_081916/")
predictor.fit_summary()
*** Summary of fit() ***
Estimated performance of each model:
model score_val pred_time_val fit_time pred_time_val_marginal fit_time_marginal stack_level can_infer fit_order
0 WeightedEnsemble_L3 -52.814584 10.829041 509.522665 0.000806 0.465734 3 True 15
1 RandomForestMSE_BAG_L2 -53.425203 9.990716 407.954854 0.503733 25.608415 2 True 12
2 ExtraTreesMSE_BAG_L2 -53.769813 9.985478 389.545316 0.498495 7.198876 2 True 14
3 LightGBM_BAG_L2 -55.094927 9.754041 404.086212 0.267058 21.739773 2 True 11
4 CatBoost_BAG_L2 -55.501246 9.558949 454.509867 0.071966 72.163428 2 True 13
5 LightGBMXT_BAG_L2 -60.494559 12.393334 431.512245 2.906351 49.165806 2 True 10
6 KNeighborsDist_BAG_L1 -84.125061 0.105933 0.031794 0.105933 0.031794 1 True 2
7 WeightedEnsemble_L2 -84.125061 0.107082 0.781110 0.001149 0.749316 2 True 9
8 KNeighborsUnif_BAG_L1 -101.546199 0.102926 0.033631 0.102926 0.033631 1 True 1
9 RandomForestMSE_BAG_L1 -116.621736 0.441193 9.532506 0.441193 9.532506 1 True 5
10 ExtraTreesMSE_BAG_L1 -124.637158 0.442806 4.213555 0.442806 4.213555 1 True 7
11 CatBoost_BAG_L1 -130.533194 0.148776 205.271412 0.148776 205.271412 1 True 6
12 LightGBM_BAG_L1 -131.054162 1.411309 27.822397 1.411309 27.822397 1 True 4
13 LightGBMXT_BAG_L1 -131.460909 6.393110 64.699227 6.393110 64.699227 1 True 3
14 NeuralNetFastAI_BAG_L1 -137.379340 0.440931 70.741918 0.440931 70.741918 1 True 8
Number of models trained: 15
Types of models trained:
{'StackerEnsembleModel_RF', 'StackerEnsembleModel_NNFastAiTabular', 'StackerEnsembleModel_XT', 'WeightedEnsembleModel', 'StackerEnsembleModel_KNN', 'StackerEnsembleModel_CatBoost', 'StackerEnsembleModel_LGB'}
Bagging used: True (with 8 folds)
Multi-layer stack-ensembling used: True (with 3 levels)
Feature Metadata (Processed):
(raw dtype, special dtypes):
('float', []) : 3 | ['temp', 'atemp', 'windspeed']
('int', []) : 3 | ['season', 'weather', 'humidity']
('int', ['bool']) : 2 | ['holiday', 'workingday']
('int', ['datetime_as_int']) : 5 | ['datetime', 'datetime.year', 'datetime.month', 'datetime.day', 'datetime.dayofweek']
Plot summary of models saved to file: AutogluonModels/ag-20220318_081916/SummaryOfModels.html
*** End of fit() summary ***
{'model_types': {'KNeighborsUnif_BAG_L1': 'StackerEnsembleModel_KNN',
'KNeighborsDist_BAG_L1': 'StackerEnsembleModel_KNN',
'LightGBMXT_BAG_L1': 'StackerEnsembleModel_LGB',
'LightGBM_BAG_L1': 'StackerEnsembleModel_LGB',
'RandomForestMSE_BAG_L1': 'StackerEnsembleModel_RF',
'CatBoost_BAG_L1': 'StackerEnsembleModel_CatBoost',
'ExtraTreesMSE_BAG_L1': 'StackerEnsembleModel_XT',
'NeuralNetFastAI_BAG_L1': 'StackerEnsembleModel_NNFastAiTabular',
'WeightedEnsemble_L2': 'WeightedEnsembleModel',
'LightGBMXT_BAG_L2': 'StackerEnsembleModel_LGB',
'LightGBM_BAG_L2': 'StackerEnsembleModel_LGB',
'RandomForestMSE_BAG_L2': 'StackerEnsembleModel_RF',
'CatBoost_BAG_L2': 'StackerEnsembleModel_CatBoost',
'ExtraTreesMSE_BAG_L2': 'StackerEnsembleModel_XT',
'WeightedEnsemble_L3': 'WeightedEnsembleModel'},
'model_performance': {'KNeighborsUnif_BAG_L1': -101.54619908446061,
'KNeighborsDist_BAG_L1': -84.12506123181602,
'LightGBMXT_BAG_L1': -131.46090891834504,
'LightGBM_BAG_L1': -131.054161598899,
'RandomForestMSE_BAG_L1': -116.62173601727898,
'CatBoost_BAG_L1': -130.5331939673838,
'ExtraTreesMSE_BAG_L1': -124.63715787314163,
'NeuralNetFastAI_BAG_L1': -137.3793400371632,
'WeightedEnsemble_L2': -84.12506123181602,
'LightGBMXT_BAG_L2': -60.49455865102558,
'LightGBM_BAG_L2': -55.0949274412813,
'RandomForestMSE_BAG_L2': -53.425203372378355,
'CatBoost_BAG_L2': -55.50124631019139,
'ExtraTreesMSE_BAG_L2': -53.76981312304573,
'WeightedEnsemble_L3': -52.814584128954735},
'model_best': 'WeightedEnsemble_L3',
'model_paths': {'KNeighborsUnif_BAG_L1': 'AutogluonModels/ag-20220318_081916/models/KNeighborsUnif_BAG_L1/',
'KNeighborsDist_BAG_L1': 'AutogluonModels/ag-20220318_081916/models/KNeighborsDist_BAG_L1/',
'LightGBMXT_BAG_L1': 'AutogluonModels/ag-20220318_081916/models/LightGBMXT_BAG_L1/',
'LightGBM_BAG_L1': 'AutogluonModels/ag-20220318_081916/models/LightGBM_BAG_L1/',
'RandomForestMSE_BAG_L1': 'AutogluonModels/ag-20220318_081916/models/RandomForestMSE_BAG_L1/',
'CatBoost_BAG_L1': 'AutogluonModels/ag-20220318_081916/models/CatBoost_BAG_L1/',
'ExtraTreesMSE_BAG_L1': 'AutogluonModels/ag-20220318_081916/models/ExtraTreesMSE_BAG_L1/',
'NeuralNetFastAI_BAG_L1': 'AutogluonModels/ag-20220318_081916/models/NeuralNetFastAI_BAG_L1/',
'WeightedEnsemble_L2': 'AutogluonModels/ag-20220318_081916/models/WeightedEnsemble_L2/',
'LightGBMXT_BAG_L2': 'AutogluonModels/ag-20220318_081916/models/LightGBMXT_BAG_L2/',
'LightGBM_BAG_L2': 'AutogluonModels/ag-20220318_081916/models/LightGBM_BAG_L2/',
'RandomForestMSE_BAG_L2': 'AutogluonModels/ag-20220318_081916/models/RandomForestMSE_BAG_L2/',
'CatBoost_BAG_L2': 'AutogluonModels/ag-20220318_081916/models/CatBoost_BAG_L2/',
'ExtraTreesMSE_BAG_L2': 'AutogluonModels/ag-20220318_081916/models/ExtraTreesMSE_BAG_L2/',
'WeightedEnsemble_L3': 'AutogluonModels/ag-20220318_081916/models/WeightedEnsemble_L3/'},
'model_fit_times': {'KNeighborsUnif_BAG_L1': 0.033631324768066406,
'KNeighborsDist_BAG_L1': 0.03179359436035156,
'LightGBMXT_BAG_L1': 64.69922685623169,
'LightGBM_BAG_L1': 27.822396993637085,
'RandomForestMSE_BAG_L1': 9.532505989074707,
'CatBoost_BAG_L1': 205.27141165733337,
'ExtraTreesMSE_BAG_L1': 4.213555097579956,
'NeuralNetFastAI_BAG_L1': 70.74191808700562,
'WeightedEnsemble_L2': 0.7493164539337158,
'LightGBMXT_BAG_L2': 49.16580557823181,
'LightGBM_BAG_L2': 21.73977255821228,
'RandomForestMSE_BAG_L2': 25.60841464996338,
'CatBoost_BAG_L2': 72.16342759132385,
'ExtraTreesMSE_BAG_L2': 7.198876142501831,
'WeightedEnsemble_L3': 0.46573400497436523},
'model_pred_times': {'KNeighborsUnif_BAG_L1': 0.10292601585388184,
'KNeighborsDist_BAG_L1': 0.10593271255493164,
'LightGBMXT_BAG_L1': 6.393110036849976,
'LightGBM_BAG_L1': 1.411308765411377,
'RandomForestMSE_BAG_L1': 0.4411928653717041,
'CatBoost_BAG_L1': 0.14877605438232422,
'ExtraTreesMSE_BAG_L1': 0.4428060054779053,
'NeuralNetFastAI_BAG_L1': 0.4409306049346924,
'WeightedEnsemble_L2': 0.0011491775512695312,
'LightGBMXT_BAG_L2': 2.90635085105896,
'LightGBM_BAG_L2': 0.2670576572418213,
'RandomForestMSE_BAG_L2': 0.5037333965301514,
'CatBoost_BAG_L2': 0.07196617126464844,
'ExtraTreesMSE_BAG_L2': 0.49849510192871094,
'WeightedEnsemble_L3': 0.0008056163787841797},
'num_bag_folds': 8,
'max_stack_level': 3,
'model_hyperparams': {'KNeighborsUnif_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True,
'use_child_oof': True},
'KNeighborsDist_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True,
'use_child_oof': True},
'LightGBMXT_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'LightGBM_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'RandomForestMSE_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True,
'use_child_oof': True},
'CatBoost_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'ExtraTreesMSE_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True,
'use_child_oof': True},
'NeuralNetFastAI_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'WeightedEnsemble_L2': {'use_orig_features': False,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'LightGBMXT_BAG_L2': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'LightGBM_BAG_L2': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'RandomForestMSE_BAG_L2': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True,
'use_child_oof': True},
'CatBoost_BAG_L2': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'ExtraTreesMSE_BAG_L2': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True,
'use_child_oof': True},
'WeightedEnsemble_L3': {'use_orig_features': False,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True}},
'leaderboard': model score_val pred_time_val fit_time \
0 WeightedEnsemble_L3 -52.814584 10.829041 509.522665
1 RandomForestMSE_BAG_L2 -53.425203 9.990716 407.954854
2 ExtraTreesMSE_BAG_L2 -53.769813 9.985478 389.545316
3 LightGBM_BAG_L2 -55.094927 9.754041 404.086212
4 CatBoost_BAG_L2 -55.501246 9.558949 454.509867
5 LightGBMXT_BAG_L2 -60.494559 12.393334 431.512245
6 KNeighborsDist_BAG_L1 -84.125061 0.105933 0.031794
7 WeightedEnsemble_L2 -84.125061 0.107082 0.781110
8 KNeighborsUnif_BAG_L1 -101.546199 0.102926 0.033631
9 RandomForestMSE_BAG_L1 -116.621736 0.441193 9.532506
10 ExtraTreesMSE_BAG_L1 -124.637158 0.442806 4.213555
11 CatBoost_BAG_L1 -130.533194 0.148776 205.271412
12 LightGBM_BAG_L1 -131.054162 1.411309 27.822397
13 LightGBMXT_BAG_L1 -131.460909 6.393110 64.699227
14 NeuralNetFastAI_BAG_L1 -137.379340 0.440931 70.741918
pred_time_val_marginal fit_time_marginal stack_level can_infer \
0 0.000806 0.465734 3 True
1 0.503733 25.608415 2 True
2 0.498495 7.198876 2 True
3 0.267058 21.739773 2 True
4 0.071966 72.163428 2 True
5 2.906351 49.165806 2 True
6 0.105933 0.031794 1 True
7 0.001149 0.749316 2 True
8 0.102926 0.033631 1 True
9 0.441193 9.532506 1 True
10 0.442806 4.213555 1 True
11 0.148776 205.271412 1 True
12 1.411309 27.822397 1 True
13 6.393110 64.699227 1 True
14 0.440931 70.741918 1 True
fit_order
0 15
1 12
2 14
3 11
4 13
5 10
6 2
7 9
8 1
9 5
10 7
11 6
12 4
13 3
14 8 }
We can also use the leaderboard() method:
lboard = predictor.leaderboard()
lboard.sort_values(by='score_val', ascending=False)
| model | score_val | pred_time_val | fit_time | pred_time_val_marginal | fit_time_marginal | stack_level | can_infer | fit_order | |
|---|---|---|---|---|---|---|---|---|---|
| 0 | WeightedEnsemble_L3 | -52.814584 | 10.829041 | 509.522665 | 0.000806 | 0.465734 | 3 | True | 15 |
| 1 | RandomForestMSE_BAG_L2 | -53.425203 | 9.990716 | 407.954854 | 0.503733 | 25.608415 | 2 | True | 12 |
| 2 | ExtraTreesMSE_BAG_L2 | -53.769813 | 9.985478 | 389.545316 | 0.498495 | 7.198876 | 2 | True | 14 |
| 3 | LightGBM_BAG_L2 | -55.094927 | 9.754041 | 404.086212 | 0.267058 | 21.739773 | 2 | True | 11 |
| 4 | CatBoost_BAG_L2 | -55.501246 | 9.558949 | 454.509867 | 0.071966 | 72.163428 | 2 | True | 13 |
| 5 | LightGBMXT_BAG_L2 | -60.494559 | 12.393334 | 431.512245 | 2.906351 | 49.165806 | 2 | True | 10 |
| 6 | KNeighborsDist_BAG_L1 | -84.125061 | 0.105933 | 0.031794 | 0.105933 | 0.031794 | 1 | True | 2 |
| 7 | WeightedEnsemble_L2 | -84.125061 | 0.107082 | 0.781110 | 0.001149 | 0.749316 | 2 | True | 9 |
| 8 | KNeighborsUnif_BAG_L1 | -101.546199 | 0.102926 | 0.033631 | 0.102926 | 0.033631 | 1 | True | 1 |
| 9 | RandomForestMSE_BAG_L1 | -116.621736 | 0.441193 | 9.532506 | 0.441193 | 9.532506 | 1 | True | 5 |
| 10 | ExtraTreesMSE_BAG_L1 | -124.637158 | 0.442806 | 4.213555 | 0.442806 | 4.213555 | 1 | True | 7 |
| 11 | CatBoost_BAG_L1 | -130.533194 | 0.148776 | 205.271412 | 0.148776 | 205.271412 | 1 | True | 6 |
| 12 | LightGBM_BAG_L1 | -131.054162 | 1.411309 | 27.822397 | 1.411309 | 27.822397 | 1 | True | 4 |
| 13 | LightGBMXT_BAG_L1 | -131.460909 | 6.393110 | 64.699227 | 6.393110 | 64.699227 | 1 | True | 3 |
| 14 | NeuralNetFastAI_BAG_L1 | -137.379340 | 0.440931 | 70.741918 | 0.440931 | 70.741918 | 1 | True | 8 |
predictions = predictor.predict(test)
predictions.head()
0    24.143776
1    41.916306
2    46.565208
3    48.943817
4    51.709141
Name: count, dtype: float32
# Describe the `predictions` series to see if there are any negative values
predictions.describe()
count    6493.000000
mean      100.891098
std        90.291023
min         2.896370
25%        20.751591
50%        63.525497
75%       170.049713
max       364.169312
Name: count, dtype: float64
# How many negative values do we have?
len(predictions[predictions<0])
0
# Set any negative predictions to zero (none were found above, but this is kept as a safeguard)
predictions[predictions<0] = 0
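Boolean-mask assignment works, but pandas also offers `Series.clip`, which expresses the same floor in one call. A minimal sketch on toy data (the real series comes from `predictor.predict(test)`):

```python
import pandas as pd

# Toy predictions containing one negative value for illustration
preds = pd.Series([24.1, -3.2, 46.5], name="count")

# clip(lower=0) floors every value at zero, equivalent to preds[preds < 0] = 0
clipped = preds.clip(lower=0)
print(clipped.tolist())  # [24.1, 0.0, 46.5]
```

`clip` returns a new series rather than mutating in place, which avoids pandas chained-assignment warnings.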
submission["count"] = predictions
submission.to_csv("submission.csv", index=False)
!kaggle competitions submit -c bike-sharing-demand -f submission.csv -m "first raw submission"
100%|█████████████████████████████████████████| 188k/188k [00:00<00:00, 540kB/s]
Successfully submitted to Bike Sharing Demand
My Submissions
!kaggle competitions submissions -c bike-sharing-demand | tail -n +1 | head -n 6
fileName                     date                 description                        status    publicScore  privateScore
---------------------------  -------------------  ---------------------------------  --------  -----------  ------------
submission.csv               2022-03-18 08:51:45  first raw submission               complete  1.80430      1.80430
submission_new_hpo.csv       2022-03-17 19:53:54  new features with hyperparameters  complete  0.45461      0.45461
submission_new_features.csv  2022-03-17 19:06:19  new features                       complete  0.67065      0.67065
submission.csv               2022-03-17 16:28:48  first raw submission               complete  1.80373      1.80373
A very useful package for performing EDA on Pandas DataFrames is the pandas-profiling package.
from pandas_profiling import ProfileReport
report = ProfileReport(train)
report.to_notebook_iframe()
However, since the report is rendered inline, its output is only visible after re-running the notebook.
# Plot a histogram of every feature to show its distribution. This is part of the exploratory data analysis.
%matplotlib inline
train.hist(figsize=(20,20))
array([[<AxesSubplot:title={'center':'datetime'}>,
<AxesSubplot:title={'center':'season'}>,
<AxesSubplot:title={'center':'holiday'}>],
[<AxesSubplot:title={'center':'workingday'}>,
<AxesSubplot:title={'center':'weather'}>,
<AxesSubplot:title={'center':'temp'}>],
[<AxesSubplot:title={'center':'atemp'}>,
<AxesSubplot:title={'center':'humidity'}>,
<AxesSubplot:title={'center':'windspeed'}>],
[<AxesSubplot:title={'center':'casual'}>,
<AxesSubplot:title={'center':'registered'}>,
<AxesSubplot:title={'center':'count'}>]], dtype=object)
Below we use the Seaborn package to make a heatmap showing the correlation between the different features.
import seaborn as sns
import matplotlib.pyplot as plt  # needed for plt.savefig below
sns.set(rc = {'figure.figsize':(15,8)})
hm = sns.heatmap(train.corr(), annot = True)
plt.savefig('./img/correlation_heatmap.png')
fig = train.set_index(train.datetime.dt.hour, append=True)['count'].unstack().plot.box(figsize=(10,5)).get_figure()
fig.savefig('./img/boxplot_hourly.png')
fig = train.set_index(train.datetime.dt.day, append=True)['count'].unstack().plot.box(figsize=(10,5)).get_figure()
fig.savefig('./img/boxplot_daily.png')
fig = train.set_index(train.datetime.dt.month, append=True)['count'].unstack().plot.box(figsize=(10,5)).get_figure()
fig.savefig('./img/boxplot_monthly.png')
The monthly and hourly boxplots above show clear trends in bike rentals. Adding features derived from these time components should make the patterns easier for the model to learn.
train.columns
Index(['datetime', 'season', 'holiday', 'workingday', 'weather', 'temp',
'atemp', 'humidity', 'windspeed', 'casual', 'registered', 'count'],
dtype='object')
# create a new feature
train['month'] = train['datetime'].dt.month
test['month'] = test['datetime'].dt.month
# create a new feature
train['hour'] = train['datetime'].dt.hour
test['hour'] = test['datetime'].dt.hour
We clearly see different demand at different times of day in the boxplot above. Let's create a feature that categorizes demand by time of day (e.g. high demand during the 7-9 AM and 4-7 PM rush hours).
import numpy as np
# create a list of our conditions
conditions = [
(train['hour'] <= 6),
(train['hour'] >6 ) & (train['hour'] <= 9),
(train['hour'] > 9) & (train['hour'] <= 15),
(train['hour'] > 15) & (train['hour'] <= 19),
(train['hour'] > 19) & (train['hour'] <=21),
(train['hour'] > 21)
]
# create a list of the values we want to assign for each condition
values = [0, 2, 1, 2, 1, 0]
# 0 = low
# 1 = Medium
# 2 = High
# create a new column and use np.select to assign values to it using our lists as arguments
train['demand'] = np.select(conditions, values)
# display updated DataFrame
train.head()
| datetime | season | holiday | workingday | weather | temp | atemp | humidity | windspeed | casual | registered | count | month | hour | demand | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2011-01-01 00:00:00 | 1 | 0 | 0 | 1 | 9.84 | 14.395 | 81 | 0.0 | 3 | 13 | 16 | 1 | 0 | 0 |
| 1 | 2011-01-01 01:00:00 | 1 | 0 | 0 | 1 | 9.02 | 13.635 | 80 | 0.0 | 8 | 32 | 40 | 1 | 1 | 0 |
| 2 | 2011-01-01 02:00:00 | 1 | 0 | 0 | 1 | 9.02 | 13.635 | 80 | 0.0 | 5 | 27 | 32 | 1 | 2 | 0 |
| 3 | 2011-01-01 03:00:00 | 1 | 0 | 0 | 1 | 9.84 | 14.395 | 75 | 0.0 | 3 | 10 | 13 | 1 | 3 | 0 |
| 4 | 2011-01-01 04:00:00 | 1 | 0 | 0 | 1 | 9.84 | 14.395 | 75 | 0.0 | 0 | 1 | 1 | 1 | 4 | 0 |
# create a list of our conditions
conditions = [
(test['hour'] <= 6),
(test['hour'] >6 ) & (test['hour'] <= 9),
(test['hour'] > 9) & (test['hour'] <= 15),
(test['hour'] > 15) & (test['hour'] <= 19),
(test['hour'] > 19) & (test['hour'] <=21),
(test['hour'] > 21)
]
# create a list of the values we want to assign for each condition
values = [0, 2, 1, 2, 1, 0]
# 0 = low
# 1 = Medium
# 2 = High
# create a new column and use np.select to assign values to it using our lists as arguments
test['demand'] = np.select(conditions, values)
# display updated DataFrame
test.head()
| datetime | season | holiday | workingday | weather | temp | atemp | humidity | windspeed | month | hour | demand | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2011-01-20 00:00:00 | 1 | 0 | 1 | 1 | 10.66 | 11.365 | 56 | 26.0027 | 1 | 0 | 0 |
| 1 | 2011-01-20 01:00:00 | 1 | 0 | 1 | 1 | 10.66 | 13.635 | 56 | 0.0000 | 1 | 1 | 0 |
| 2 | 2011-01-20 02:00:00 | 1 | 0 | 1 | 1 | 10.66 | 13.635 | 56 | 0.0000 | 1 | 2 | 0 |
| 3 | 2011-01-20 03:00:00 | 1 | 0 | 1 | 1 | 10.66 | 12.880 | 56 | 11.0014 | 1 | 3 | 0 |
| 4 | 2011-01-20 04:00:00 | 1 | 0 | 1 | 1 | 10.66 | 12.880 | 56 | 11.0014 | 1 | 4 | 0 |
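The same binning conditions are written out twice, once for `train` and once for `test`. One way to avoid the duplication is a small helper applied to both frames (the function name `add_demand` is hypothetical, not from the original notebook):

```python
import numpy as np
import pandas as pd

def add_demand(df: pd.DataFrame) -> pd.DataFrame:
    """Derive the demand category (0=low, 1=medium, 2=high) from the hour column."""
    h = df["hour"]
    conditions = [
        (h <= 6),
        (h > 6) & (h <= 9),
        (h > 9) & (h <= 15),
        (h > 15) & (h <= 19),
        (h > 19) & (h <= 21),
        (h > 21),
    ]
    values = [0, 2, 1, 2, 1, 0]
    df["demand"] = np.select(conditions, values)
    return df

# Applied to both frames in turn:
# train = add_demand(train)
# test = add_demand(test)
```

This keeps the bin edges in one place, so a change to the rush-hour windows cannot silently diverge between the two frames.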
train["season"] = train["season"].astype('category')
train["weather"] = train["weather"].astype('category')
train["demand"] = train["demand"].astype('category')
test["season"] = test["season"].astype('category')
test["weather"] = test["weather"].astype('category')
test["demand"] = test["demand"].astype('category')
train.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10886 entries, 0 to 10885
Data columns (total 15 columns):
 #   Column      Non-Null Count  Dtype
---  ------      --------------  -----
 0   datetime    10886 non-null  datetime64[ns]
 1   season      10886 non-null  category
 2   holiday     10886 non-null  int64
 3   workingday  10886 non-null  int64
 4   weather     10886 non-null  category
 5   temp        10886 non-null  float64
 6   atemp       10886 non-null  float64
 7   humidity    10886 non-null  int64
 8   windspeed   10886 non-null  float64
 9   casual      10886 non-null  int64
 10  registered  10886 non-null  int64
 11  count       10886 non-null  int64
 12  month       10886 non-null  int64
 13  hour        10886 non-null  int64
 14  demand      10886 non-null  category
dtypes: category(3), datetime64[ns](1), float64(3), int64(8)
memory usage: 1.0 MB
# View the new features
train.head()
| datetime | season | holiday | workingday | weather | temp | atemp | humidity | windspeed | casual | registered | count | month | hour | demand | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2011-01-01 00:00:00 | 1 | 0 | 0 | 1 | 9.84 | 14.395 | 81 | 0.0 | 3 | 13 | 16 | 1 | 0 | 0 |
| 1 | 2011-01-01 01:00:00 | 1 | 0 | 0 | 1 | 9.02 | 13.635 | 80 | 0.0 | 8 | 32 | 40 | 1 | 1 | 0 |
| 2 | 2011-01-01 02:00:00 | 1 | 0 | 0 | 1 | 9.02 | 13.635 | 80 | 0.0 | 5 | 27 | 32 | 1 | 2 | 0 |
| 3 | 2011-01-01 03:00:00 | 1 | 0 | 0 | 1 | 9.84 | 14.395 | 75 | 0.0 | 3 | 10 | 13 | 1 | 3 | 0 |
| 4 | 2011-01-01 04:00:00 | 1 | 0 | 0 | 1 | 9.84 | 14.395 | 75 | 0.0 | 0 | 1 | 1 | 1 | 4 | 0 |
# View histogram of all features again now with the hour feature
train.hist(figsize=(20,20))
array([[<AxesSubplot:title={'center':'datetime'}>,
<AxesSubplot:title={'center':'holiday'}>,
<AxesSubplot:title={'center':'workingday'}>],
[<AxesSubplot:title={'center':'temp'}>,
<AxesSubplot:title={'center':'atemp'}>,
<AxesSubplot:title={'center':'humidity'}>],
[<AxesSubplot:title={'center':'windspeed'}>,
<AxesSubplot:title={'center':'casual'}>,
<AxesSubplot:title={'center':'registered'}>],
[<AxesSubplot:title={'center':'count'}>,
<AxesSubplot:title={'center':'month'}>,
<AxesSubplot:title={'center':'hour'}>]], dtype=object)
train['demand'].hist()
<AxesSubplot:>
predictor_new_features = TabularPredictor(
label='count',
eval_metric='root_mean_squared_error').fit(train, ignored_columns=['registered', 'casual'], time_limit=600, presets='best_quality')
No path specified. Models will be saved in: "AutogluonModels/ag-20220317_184828/"
Presets specified: ['best_quality']
Beginning AutoGluon training ... Time limit = 600s
AutoGluon will save models to "AutogluonModels/ag-20220317_184828/"
AutoGluon Version: 0.4.0
Python Version: 3.7.10
Operating System: Linux
Train Data Rows: 10886
Train Data Columns: 12
Label Column: count
Preprocessing data ...
AutoGluon infers your prediction problem is: 'regression' (because dtype of label-column == int and many unique label-values observed).
Label info (max, min, mean, stddev): (977, 1, 191.57413, 181.14445)
If 'regression' is not the correct problem_type, please manually specify the problem_type parameter during predictor init (You may specify problem_type as one of: ['binary', 'multiclass', 'regression'])
Using Feature Generators to preprocess the data ...
Fitting AutoMLPipelineFeatureGenerator...
Available Memory: 1765.66 MB
Train Data (Original) Memory Usage: 0.82 MB (0.0% of available memory)
Inferring data type of each feature based on column values. Set feature_metadata_in to manually specify special dtypes of the features.
Stage 1 Generators:
Fitting AsTypeFeatureGenerator...
Note: Converting 2 features to boolean dtype as they only contain 2 unique values.
Stage 2 Generators:
Fitting FillNaFeatureGenerator...
Stage 3 Generators:
Fitting IdentityFeatureGenerator...
Fitting CategoryFeatureGenerator...
Fitting CategoryMemoryMinimizeFeatureGenerator...
Fitting DatetimeFeatureGenerator...
Stage 4 Generators:
Fitting DropUniqueFeatureGenerator...
Types of features in original data (raw dtype, special dtypes):
('category', []) : 3 | ['season', 'weather', 'demand']
('datetime', []) : 1 | ['datetime']
('float', []) : 3 | ['temp', 'atemp', 'windspeed']
('int', []) : 5 | ['holiday', 'workingday', 'humidity', 'month', 'hour']
Types of features in processed data (raw dtype, special dtypes):
('category', []) : 3 | ['season', 'weather', 'demand']
('float', []) : 3 | ['temp', 'atemp', 'windspeed']
('int', []) : 3 | ['humidity', 'month', 'hour']
('int', ['bool']) : 2 | ['holiday', 'workingday']
('int', ['datetime_as_int']) : 5 | ['datetime', 'datetime.year', 'datetime.month', 'datetime.day', 'datetime.dayofweek']
0.1s = Fit runtime
12 features in original data used to generate 16 features in processed data.
Train Data (Processed) Memory Usage: 1.01 MB (0.1% of available memory)
Data preprocessing and feature engineering runtime = 0.19s ...
AutoGluon will gauge predictive performance using evaluation metric: 'root_mean_squared_error'
To change this, specify the eval_metric parameter of Predictor()
AutoGluon will fit 2 stack levels (L1 to L2) ...
Fitting 11 L1 models ...
Fitting model: KNeighborsUnif_BAG_L1 ... Training model for up to 399.77s of the 599.81s of remaining time.
-101.5462 = Validation score (root_mean_squared_error)
0.08s = Training runtime
0.1s = Validation runtime
Fitting model: KNeighborsDist_BAG_L1 ... Training model for up to 399.31s of the 599.34s of remaining time.
-84.1251 = Validation score (root_mean_squared_error)
0.04s = Training runtime
0.1s = Validation runtime
Fitting model: LightGBMXT_BAG_L1 ... Training model for up to 398.89s of the 598.92s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
-34.463 = Validation score (root_mean_squared_error)
77.93s = Training runtime
8.63s = Validation runtime
Fitting model: LightGBM_BAG_L1 ... Training model for up to 315.92s of the 515.96s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
-33.3463 = Validation score (root_mean_squared_error)
50.32s = Training runtime
3.94s = Validation runtime
Fitting model: RandomForestMSE_BAG_L1 ... Training model for up to 260.84s of the 460.88s of remaining time.
-38.5845 = Validation score (root_mean_squared_error)
11.85s = Training runtime
0.5s = Validation runtime
Fitting model: CatBoost_BAG_L1 ... Training model for up to 246.11s of the 446.15s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
-34.1977 = Validation score (root_mean_squared_error)
206.9s = Training runtime
0.21s = Validation runtime
Fitting model: ExtraTreesMSE_BAG_L1 ... Training model for up to 34.9s of the 234.93s of remaining time.
-39.0671 = Validation score (root_mean_squared_error)
5.35s = Training runtime
0.49s = Validation runtime
Fitting model: NeuralNetFastAI_BAG_L1 ... Training model for up to 26.56s of the 226.59s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
-70.7318 = Validation score (root_mean_squared_error)
42.43s = Training runtime
0.49s = Validation runtime
Completed 1/20 k-fold bagging repeats ...
Fitting model: WeightedEnsemble_L2 ... Training model for up to 360.0s of the 181.09s of remaining time.
-32.1239 = Validation score (root_mean_squared_error)
0.91s = Training runtime
0.0s = Validation runtime
Fitting 9 L2 models ...
Fitting model: LightGBMXT_BAG_L2 ... Training model for up to 180.07s of the 180.05s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
-31.4333 = Validation score (root_mean_squared_error)
27.25s = Training runtime
0.72s = Validation runtime
Fitting model: LightGBM_BAG_L2 ... Training model for up to 149.63s of the 149.61s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
-30.5722 = Validation score (root_mean_squared_error)
23.52s = Training runtime
0.26s = Validation runtime
Fitting model: RandomForestMSE_BAG_L2 ... Training model for up to 121.3s of the 121.28s of remaining time.
-31.8415 = Validation score (root_mean_squared_error)
29.54s = Training runtime
0.55s = Validation runtime
Fitting model: CatBoost_BAG_L2 ... Training model for up to 88.83s of the 88.81s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
-30.5227 = Validation score (root_mean_squared_error)
80.58s = Training runtime
0.12s = Validation runtime
Fitting model: ExtraTreesMSE_BAG_L2 ... Training model for up to 5.31s of the 5.29s of remaining time.
-31.5164 = Validation score (root_mean_squared_error)
8.86s = Training runtime
0.56s = Validation runtime
Completed 1/20 k-fold bagging repeats ...
Fitting model: WeightedEnsemble_L3 ... Training model for up to 360.0s of the -6.93s of remaining time.
-30.2597 = Validation score (root_mean_squared_error)
0.36s = Training runtime
0.0s = Validation runtime
AutoGluon training complete, total runtime = 607.57s ... Best model: "WeightedEnsemble_L3"
TabularPredictor saved. To load, use: predictor = TabularPredictor.load("AutogluonModels/ag-20220317_184828/")
predictor_new_features.fit_summary()
*** Summary of fit() ***
Estimated performance of each model:
model score_val pred_time_val fit_time pred_time_val_marginal fit_time_marginal stack_level can_infer fit_order
0 WeightedEnsemble_L3 -30.259650 15.548556 526.614877 0.000750 0.355583 3 True 15
1 CatBoost_BAG_L2 -30.522719 14.571728 475.481483 0.117465 80.577325 2 True 13
2 LightGBM_BAG_L2 -30.572179 14.710463 418.428192 0.256200 23.524034 2 True 11
3 LightGBMXT_BAG_L2 -31.433263 15.174141 422.157936 0.719877 27.253777 2 True 10
4 ExtraTreesMSE_BAG_L2 -31.516369 15.012706 403.765763 0.558442 8.861604 2 True 14
5 RandomForestMSE_BAG_L2 -31.841482 15.007791 424.445922 0.553528 29.541763 2 True 12
6 WeightedEnsemble_L2 -32.123931 13.376248 347.944103 0.000746 0.905110 2 True 9
7 LightGBM_BAG_L1 -33.346307 3.938998 50.315295 3.938998 50.315295 1 True 4
8 CatBoost_BAG_L1 -34.197651 0.205096 206.901348 0.205096 206.901348 1 True 6
9 LightGBMXT_BAG_L1 -34.463021 8.629419 77.927042 8.629419 77.927042 1 True 3
10 RandomForestMSE_BAG_L1 -38.584500 0.497963 11.851322 0.497963 11.851322 1 True 5
11 ExtraTreesMSE_BAG_L1 -39.067132 0.485636 5.350195 0.485636 5.350195 1 True 7
12 NeuralNetFastAI_BAG_L1 -70.731815 0.489283 42.432183 0.489283 42.432183 1 True 8
13 KNeighborsDist_BAG_L1 -84.125061 0.104026 0.043987 0.104026 0.043987 1 True 2
14 KNeighborsUnif_BAG_L1 -101.546199 0.103842 0.082787 0.103842 0.082787 1 True 1
Number of models trained: 15
Types of models trained:
{'WeightedEnsembleModel', 'StackerEnsembleModel_NNFastAiTabular', 'StackerEnsembleModel_RF', 'StackerEnsembleModel_KNN', 'StackerEnsembleModel_LGB', 'StackerEnsembleModel_CatBoost', 'StackerEnsembleModel_XT'}
Bagging used: True (with 8 folds)
Multi-layer stack-ensembling used: True (with 3 levels)
Feature Metadata (Processed):
(raw dtype, special dtypes):
('category', []) : 3 | ['season', 'weather', 'demand']
('float', []) : 3 | ['temp', 'atemp', 'windspeed']
('int', []) : 3 | ['humidity', 'month', 'hour']
('int', ['bool']) : 2 | ['holiday', 'workingday']
('int', ['datetime_as_int']) : 5 | ['datetime', 'datetime.year', 'datetime.month', 'datetime.day', 'datetime.dayofweek']
Plot summary of models saved to file: AutogluonModels/ag-20220317_184828/SummaryOfModels.html
*** End of fit() summary ***
{'model_types': {'KNeighborsUnif_BAG_L1': 'StackerEnsembleModel_KNN',
'KNeighborsDist_BAG_L1': 'StackerEnsembleModel_KNN',
'LightGBMXT_BAG_L1': 'StackerEnsembleModel_LGB',
'LightGBM_BAG_L1': 'StackerEnsembleModel_LGB',
'RandomForestMSE_BAG_L1': 'StackerEnsembleModel_RF',
'CatBoost_BAG_L1': 'StackerEnsembleModel_CatBoost',
'ExtraTreesMSE_BAG_L1': 'StackerEnsembleModel_XT',
'NeuralNetFastAI_BAG_L1': 'StackerEnsembleModel_NNFastAiTabular',
'WeightedEnsemble_L2': 'WeightedEnsembleModel',
'LightGBMXT_BAG_L2': 'StackerEnsembleModel_LGB',
'LightGBM_BAG_L2': 'StackerEnsembleModel_LGB',
'RandomForestMSE_BAG_L2': 'StackerEnsembleModel_RF',
'CatBoost_BAG_L2': 'StackerEnsembleModel_CatBoost',
'ExtraTreesMSE_BAG_L2': 'StackerEnsembleModel_XT',
'WeightedEnsemble_L3': 'WeightedEnsembleModel'},
'model_performance': {'KNeighborsUnif_BAG_L1': -101.54619908446061,
'KNeighborsDist_BAG_L1': -84.12506123181602,
'LightGBMXT_BAG_L1': -34.463021343372745,
'LightGBM_BAG_L1': -33.346306837108116,
'RandomForestMSE_BAG_L1': -38.58450049457067,
'CatBoost_BAG_L1': -34.19765054632351,
'ExtraTreesMSE_BAG_L1': -39.06713183023691,
'NeuralNetFastAI_BAG_L1': -70.73181509028429,
'WeightedEnsemble_L2': -32.123930869198965,
'LightGBMXT_BAG_L2': -31.43326291390496,
'LightGBM_BAG_L2': -30.572179197902486,
'RandomForestMSE_BAG_L2': -31.841481949600695,
'CatBoost_BAG_L2': -30.522718520972884,
'ExtraTreesMSE_BAG_L2': -31.516369272598922,
'WeightedEnsemble_L3': -30.25965032399917},
'model_best': 'WeightedEnsemble_L3',
'model_paths': {'KNeighborsUnif_BAG_L1': 'AutogluonModels/ag-20220317_184828/models/KNeighborsUnif_BAG_L1/',
'KNeighborsDist_BAG_L1': 'AutogluonModels/ag-20220317_184828/models/KNeighborsDist_BAG_L1/',
'LightGBMXT_BAG_L1': 'AutogluonModels/ag-20220317_184828/models/LightGBMXT_BAG_L1/',
'LightGBM_BAG_L1': 'AutogluonModels/ag-20220317_184828/models/LightGBM_BAG_L1/',
'RandomForestMSE_BAG_L1': 'AutogluonModels/ag-20220317_184828/models/RandomForestMSE_BAG_L1/',
'CatBoost_BAG_L1': 'AutogluonModels/ag-20220317_184828/models/CatBoost_BAG_L1/',
'ExtraTreesMSE_BAG_L1': 'AutogluonModels/ag-20220317_184828/models/ExtraTreesMSE_BAG_L1/',
'NeuralNetFastAI_BAG_L1': 'AutogluonModels/ag-20220317_184828/models/NeuralNetFastAI_BAG_L1/',
'WeightedEnsemble_L2': 'AutogluonModels/ag-20220317_184828/models/WeightedEnsemble_L2/',
'LightGBMXT_BAG_L2': 'AutogluonModels/ag-20220317_184828/models/LightGBMXT_BAG_L2/',
'LightGBM_BAG_L2': 'AutogluonModels/ag-20220317_184828/models/LightGBM_BAG_L2/',
'RandomForestMSE_BAG_L2': 'AutogluonModels/ag-20220317_184828/models/RandomForestMSE_BAG_L2/',
'CatBoost_BAG_L2': 'AutogluonModels/ag-20220317_184828/models/CatBoost_BAG_L2/',
'ExtraTreesMSE_BAG_L2': 'AutogluonModels/ag-20220317_184828/models/ExtraTreesMSE_BAG_L2/',
'WeightedEnsemble_L3': 'AutogluonModels/ag-20220317_184828/models/WeightedEnsemble_L3/'},
'model_fit_times': {'KNeighborsUnif_BAG_L1': 0.08278703689575195,
'KNeighborsDist_BAG_L1': 0.043987274169921875,
'LightGBMXT_BAG_L1': 77.92704153060913,
'LightGBM_BAG_L1': 50.31529498100281,
'RandomForestMSE_BAG_L1': 11.851321935653687,
'CatBoost_BAG_L1': 206.9013478755951,
'ExtraTreesMSE_BAG_L1': 5.350195407867432,
'NeuralNetFastAI_BAG_L1': 42.4321825504303,
'WeightedEnsemble_L2': 0.9051096439361572,
'LightGBMXT_BAG_L2': 27.253777265548706,
'LightGBM_BAG_L2': 23.524033784866333,
'RandomForestMSE_BAG_L2': 29.541763067245483,
'CatBoost_BAG_L2': 80.57732462882996,
'ExtraTreesMSE_BAG_L2': 8.86160397529602,
'WeightedEnsemble_L3': 0.35558319091796875},
'model_pred_times': {'KNeighborsUnif_BAG_L1': 0.10384178161621094,
'KNeighborsDist_BAG_L1': 0.10402631759643555,
'LightGBMXT_BAG_L1': 8.629418849945068,
'LightGBM_BAG_L1': 3.938997983932495,
'RandomForestMSE_BAG_L1': 0.49796319007873535,
'CatBoost_BAG_L1': 0.20509576797485352,
'ExtraTreesMSE_BAG_L1': 0.48563599586486816,
'NeuralNetFastAI_BAG_L1': 0.48928332328796387,
'WeightedEnsemble_L2': 0.0007462501525878906,
'LightGBMXT_BAG_L2': 0.7198774814605713,
'LightGBM_BAG_L2': 0.25620007514953613,
'RandomForestMSE_BAG_L2': 0.5535280704498291,
'CatBoost_BAG_L2': 0.11746501922607422,
'ExtraTreesMSE_BAG_L2': 0.5584423542022705,
'WeightedEnsemble_L3': 0.0007498264312744141},
'num_bag_folds': 8,
'max_stack_level': 3,
'model_hyperparams': {'KNeighborsUnif_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True,
'use_child_oof': True},
'KNeighborsDist_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True,
'use_child_oof': True},
'LightGBMXT_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'LightGBM_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'RandomForestMSE_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True,
'use_child_oof': True},
'CatBoost_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'ExtraTreesMSE_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True,
'use_child_oof': True},
'NeuralNetFastAI_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'WeightedEnsemble_L2': {'use_orig_features': False,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'LightGBMXT_BAG_L2': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'LightGBM_BAG_L2': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'RandomForestMSE_BAG_L2': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True,
'use_child_oof': True},
'CatBoost_BAG_L2': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'ExtraTreesMSE_BAG_L2': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True,
'use_child_oof': True},
'WeightedEnsemble_L3': {'use_orig_features': False,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True}},
'leaderboard': model score_val pred_time_val fit_time \
0 WeightedEnsemble_L3 -30.259650 15.548556 526.614877
1 CatBoost_BAG_L2 -30.522719 14.571728 475.481483
2 LightGBM_BAG_L2 -30.572179 14.710463 418.428192
3 LightGBMXT_BAG_L2 -31.433263 15.174141 422.157936
4 ExtraTreesMSE_BAG_L2 -31.516369 15.012706 403.765763
5 RandomForestMSE_BAG_L2 -31.841482 15.007791 424.445922
6 WeightedEnsemble_L2 -32.123931 13.376248 347.944103
7 LightGBM_BAG_L1 -33.346307 3.938998 50.315295
8 CatBoost_BAG_L1 -34.197651 0.205096 206.901348
9 LightGBMXT_BAG_L1 -34.463021 8.629419 77.927042
10 RandomForestMSE_BAG_L1 -38.584500 0.497963 11.851322
11 ExtraTreesMSE_BAG_L1 -39.067132 0.485636 5.350195
12 NeuralNetFastAI_BAG_L1 -70.731815 0.489283 42.432183
13 KNeighborsDist_BAG_L1 -84.125061 0.104026 0.043987
14 KNeighborsUnif_BAG_L1 -101.546199 0.103842 0.082787
pred_time_val_marginal fit_time_marginal stack_level can_infer \
0 0.000750 0.355583 3 True
1 0.117465 80.577325 2 True
2 0.256200 23.524034 2 True
3 0.719877 27.253777 2 True
4 0.558442 8.861604 2 True
5 0.553528 29.541763 2 True
6 0.000746 0.905110 2 True
7 3.938998 50.315295 1 True
8 0.205096 206.901348 1 True
9 8.629419 77.927042 1 True
10 0.497963 11.851322 1 True
11 0.485636 5.350195 1 True
12 0.489283 42.432183 1 True
13 0.104026 0.043987 1 True
14 0.103842 0.082787 1 True
fit_order
0 15
1 13
2 11
3 10
4 14
5 12
6 9
7 4
8 6
9 3
10 5
11 7
12 8
13 2
14 1 }
predictions_new_features = predictor_new_features.predict(test)
predictions_new_features.head()
0    24.718138
1    41.588139
2    46.161873
3    49.389935
4    52.169258
Name: count, dtype: float32
# Describe the `predictions_new_features` series to see if there are any negative values
predictions_new_features.describe()
count    6493.000000
mean      154.073578
std       132.309128
min         2.485860
25%        53.044235
50%       120.548828
75%       217.150909
max       796.222656
Name: count, dtype: float64
# How many negative values do we have?
len(predictions_new_features[predictions_new_features<0])
0
# Set them to zero (just here in case I rerun and we have negative values)
predictions_new_features[predictions_new_features < 0] = 0
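The same safeguard can be written more concisely with pandas' clip method, which replaces all values below the lower bound in one call. A minimal sketch on hypothetical values (not the competition predictions):

```python
import pandas as pd

# Hypothetical predictions containing a negative value
preds = pd.Series([24.7, -3.2, 46.1], name="count")

# clip(lower=0) zeroes out negatives, equivalent to the
# boolean-mask assignment used above
preds = preds.clip(lower=0)
```

Kaggle rejects submissions with negative counts for this competition, so either form works as a pre-submission guard.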
# Submit the predictions, same workflow as before
submission_new_features = pd.read_csv('sampleSubmission.csv', parse_dates=['datetime'])
submission_new_features["count"] = predictions_new_features
submission_new_features.to_csv("submission_new_features.csv", index=False)
!kaggle competitions submit -c bike-sharing-demand -f submission_new_features.csv -m "new features"
100%|█████████████████████████████████████████| 188k/188k [00:00<00:00, 442kB/s]
Successfully submitted to Bike Sharing Demand
!kaggle competitions submissions -c bike-sharing-demand | tail -n +1 | head -n 6
fileName                     date                 description           status    publicScore  privateScore
---------------------------  -------------------  --------------------  --------  -----------  ------------
submission_new_features.csv  2022-03-17 19:06:19  new features          complete  0.67065      0.67065
submission.csv               2022-03-17 16:28:48  first raw submission  complete  1.80373      1.80373
submission.csv               2022-03-14 16:28:05  first raw submission  complete  1.80918      1.80918
Just by adding a few additional features, the public Kaggle score improved from about 1.81 to 0.67, a significant gain with no model tuning at all.
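The size of the gain can be quantified directly from the submission scores listed above (RMSLE, lower is better); a quick sketch:

```python
# Public Kaggle scores copied from the submissions listing above (lower is better)
scores = {"first_raw": 1.80918, "new_features": 0.67065}

# Relative improvement of the feature-engineered model over the first raw submission
improvement = 1 - scores["new_features"] / scores["first_raw"]
print(f"Relative improvement: {improvement:.1%}")  # → Relative improvement: 62.9%
```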
Hyperparameter tuning is configured through the hyperparameters and hyperparameter_tune_kwargs arguments.
train_data = train.loc[train['datetime'].dt.day < 17]
val_data = train.loc[train['datetime'].dt.day >= 17]
len(train_data), len(val_data)
(9174, 1712)
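The same day-of-month split rule can be sanity-checked on a toy frame. A small sketch with synthetic hourly timestamps (not the competition data): days 1 to 16 go to training, days 17 onward to validation, so nothing is dropped or duplicated.

```python
import pandas as pd

# Toy frame: one 31-day month of hourly timestamps (hypothetical data)
df = pd.DataFrame({"datetime": pd.date_range("2011-01-01", periods=31 * 24, freq="h")})

# Same split rule as above: days 1-16 for training, days 17+ for validation
train_part = df.loc[df["datetime"].dt.day < 17]
val_part = df.loc[df["datetime"].dt.day >= 17]
```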
The hyperparameters defined below for the neural network and GBM models are based on the instructions given on the AutoGluon website. AutoGluon's default settings do not tune these types of models.
import autogluon.core as ag
time_limit = 30*60 # train various models for ~30 mins
num_trials = 10 # try at most 10 different hyperparameter configurations for each type of model
search_strategy = 'auto' # to tune hyperparameters using random search routine with a local scheduler
hyperparameter_tune_kwargs = {
'num_trials': num_trials,
'scheduler' : 'local',
'searcher': search_strategy,
}
# Non-default hyperparameter values for neural network models
nn_options = {
'num_epochs': 30, # number of training epochs (controls training time of NN models)
'learning_rate': ag.space.Real(1e-5, 1e-1, default=1e-3, log=True), # learning rate used in training (real-valued hyperparameter searched on log-scale)
'activation': ag.space.Categorical('relu', 'softrelu', 'sigmoid','tanh'), # activation function used in NN (categorical hyperparameter, default = first entry)
'dropout_prob': ag.space.Real(0.0, 0.5, default=0.1), # dropout probability (real-valued hyperparameter)
}
gbm_options = { # specifies non-default hyperparameter values for lightGBM gradient boosted trees
'num_boost_round': 100, # number of boosting rounds (controls training time of GBM models)
'num_leaves': ag.space.Int(lower=26, upper=66, default=36), # number of leaves in trees (integer hyperparameter)
}
hyperparameters = { # hyperparameters of each model type
'GBM': gbm_options,
'NN_TORCH': nn_options,
} # When these keys are missing from hyperparameters dict, no models of that type are trained
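As an aside, ag.space.Real(..., log=True) searches the learning rate on a log scale, so each decade between 1e-5 and 1e-1 is sampled with equal probability instead of the search being dominated by the largest values. The effect can be illustrated with plain Python (an illustration of log-uniform sampling, not AutoGluon's internal sampler):

```python
import math
import random

random.seed(0)

def sample_log_uniform(lower, upper):
    # Sample the exponent uniformly, then map back: this is log-uniform sampling
    return math.exp(random.uniform(math.log(lower), math.log(upper)))

# Draw many learning-rate candidates from the same range as nn_options above
samples = [sample_log_uniform(1e-5, 1e-1) for _ in range(1000)]
```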
predictor_new_hpo = TabularPredictor(label='count', eval_metric='root_mean_squared_error').fit(
train, ignored_columns=['registered', 'casual'], time_limit=time_limit, presets='best_quality',
auto_stack=True, num_bag_folds=5, num_bag_sets=5, num_stack_levels=2,
hyperparameters=hyperparameters, hyperparameter_tune_kwargs=hyperparameter_tune_kwargs)
No path specified. Models will be saved in: "AutogluonModels/ag-20220317_191204/"
Presets specified: ['best_quality']
Warning: hyperparameter tuning is currently experimental and may cause the process to hang.
Beginning AutoGluon training ... Time limit = 1800s
AutoGluon will save models to "AutogluonModels/ag-20220317_191204/"
AutoGluon Version: 0.4.0
Python Version: 3.7.10
Operating System: Linux
Train Data Rows: 9174
Train Data Columns: 12
Label Column: count
Preprocessing data ...
AutoGluon infers your prediction problem is: 'regression' (because dtype of label-column == int and many unique label-values observed).
Label info (max, min, mean, stddev): (977, 1, 190.58175, 181.01153)
If 'regression' is not the correct problem_type, please manually specify the problem_type parameter during predictor init (You may specify problem_type as one of: ['binary', 'multiclass', 'regression'])
Using Feature Generators to preprocess the data ...
Fitting AutoMLPipelineFeatureGenerator...
Available Memory: 1868.18 MB
Train Data (Original) Memory Usage: 0.69 MB (0.0% of available memory)
Inferring data type of each feature based on column values. Set feature_metadata_in to manually specify special dtypes of the features.
Stage 1 Generators:
Fitting AsTypeFeatureGenerator...
Note: Converting 2 features to boolean dtype as they only contain 2 unique values.
Stage 2 Generators:
Fitting FillNaFeatureGenerator...
Stage 3 Generators:
Fitting IdentityFeatureGenerator...
Fitting CategoryFeatureGenerator...
Fitting CategoryMemoryMinimizeFeatureGenerator...
Fitting DatetimeFeatureGenerator...
Stage 4 Generators:
Fitting DropUniqueFeatureGenerator...
Types of features in original data (raw dtype, special dtypes):
('category', []) : 3 | ['season', 'weather', 'demand']
('datetime', []) : 1 | ['datetime']
('float', []) : 3 | ['temp', 'atemp', 'windspeed']
('int', []) : 5 | ['holiday', 'workingday', 'humidity', 'month', 'hour']
Types of features in processed data (raw dtype, special dtypes):
('category', []) : 3 | ['season', 'weather', 'demand']
('float', []) : 3 | ['temp', 'atemp', 'windspeed']
('int', []) : 3 | ['humidity', 'month', 'hour']
('int', ['bool']) : 2 | ['holiday', 'workingday']
('int', ['datetime_as_int']) : 5 | ['datetime', 'datetime.year', 'datetime.month', 'datetime.day', 'datetime.dayofweek']
0.2s = Fit runtime
12 features in original data used to generate 16 features in processed data.
Train Data (Processed) Memory Usage: 0.85 MB (0.0% of available memory)
Data preprocessing and feature engineering runtime = 0.26s ...
AutoGluon will gauge predictive performance using evaluation metric: 'root_mean_squared_error'
To change this, specify the eval_metric parameter of Predictor()
AutoGluon will fit 3 stack levels (L1 to L3) ...
Fitting 2 L1 models ...
Hyperparameter tuning model: LightGBM_BAG_L1 ... Tuning model for up to 71.97s of the 1799.73s of remaining time.
Fitted model: LightGBM_BAG_L1/T1 ...
-42.2527 = Validation score (root_mean_squared_error)
0.65s = Training runtime
0.01s = Validation runtime
Fitted model: LightGBM_BAG_L1/T2 ...
-41.4396 = Validation score (root_mean_squared_error)
0.63s = Training runtime
0.02s = Validation runtime
Fitted model: LightGBM_BAG_L1/T3 ...
-41.8544 = Validation score (root_mean_squared_error)
0.7s = Training runtime
0.02s = Validation runtime
Fitted model: LightGBM_BAG_L1/T4 ...
-118.1099 = Validation score (root_mean_squared_error)
0.65s = Training runtime
0.01s = Validation runtime
Fitted model: LightGBM_BAG_L1/T5 ...
-44.7137 = Validation score (root_mean_squared_error)
0.66s = Training runtime
0.01s = Validation runtime
Fitted model: LightGBM_BAG_L1/T6 ...
-109.5866 = Validation score (root_mean_squared_error)
0.69s = Training runtime
0.01s = Validation runtime
Fitted model: LightGBM_BAG_L1/T7 ...
-40.6877 = Validation score (root_mean_squared_error)
0.63s = Training runtime
0.01s = Validation runtime
Fitted model: LightGBM_BAG_L1/T8 ...
-40.3264 = Validation score (root_mean_squared_error)
0.67s = Training runtime
0.02s = Validation runtime
Fitted model: LightGBM_BAG_L1/T9 ...
-106.4783 = Validation score (root_mean_squared_error)
0.63s = Training runtime
0.01s = Validation runtime
Fitted model: LightGBM_BAG_L1/T10 ...
-38.3475 = Validation score (root_mean_squared_error)
0.63s = Training runtime
0.01s = Validation runtime
Hyperparameter tuning model: NeuralNetTorch_BAG_L1 ... Tuning model for up to 71.97s of the 1790.82s of remaining time.
Stopping HPO to satisfy time limit...
Fitted model: NeuralNetTorch_BAG_L1/T1 ...
-64.8242 = Validation score (root_mean_squared_error)
10.14s = Training runtime
0.04s = Validation runtime
Fitted model: NeuralNetTorch_BAG_L1/T2 ...
-58.9421 = Validation score (root_mean_squared_error)
17.62s = Training runtime
0.04s = Validation runtime
Fitted model: NeuralNetTorch_BAG_L1/T3 ...
-634.2998 = Validation score (root_mean_squared_error)
27.28s = Training runtime
0.1s = Validation runtime
Fitting model: LightGBM_BAG_L1/T1 ... Training model for up to 734.39s of the 1734.44s of remaining time.
Fitting 4 child models (S1F2 - S1F5) | Fitting with ParallelLocalFoldFittingStrategy
-39.956 = Validation score (root_mean_squared_error)
7.67s = Training runtime
0.11s = Validation runtime
Fitting model: LightGBM_BAG_L1/T2 ... Training model for up to 724.43s of the 1724.48s of remaining time.
Fitting 4 child models (S1F2 - S1F5) | Fitting with ParallelLocalFoldFittingStrategy
-38.6603 = Validation score (root_mean_squared_error)
7.51s = Training runtime
0.11s = Validation runtime
Fitting model: LightGBM_BAG_L1/T3 ... Training model for up to 714.44s of the 1714.49s of remaining time.
Fitting 4 child models (S1F2 - S1F5) | Fitting with ParallelLocalFoldFittingStrategy
-39.3444 = Validation score (root_mean_squared_error)
7.88s = Training runtime
0.13s = Validation runtime
Fitting model: LightGBM_BAG_L1/T4 ... Training model for up to 703.82s of the 1703.87s of remaining time.
Fitting 4 child models (S1F2 - S1F5) | Fitting with ParallelLocalFoldFittingStrategy
-117.0263 = Validation score (root_mean_squared_error)
8.22s = Training runtime
0.09s = Validation runtime
Fitting model: LightGBM_BAG_L1/T5 ... Training model for up to 692.95s of the 1693.0s of remaining time.
Fitting 4 child models (S1F2 - S1F5) | Fitting with ParallelLocalFoldFittingStrategy
2022-03-17 19:13:56,978 WARNING worker.py:1228 -- The actor or task with ID 00693b09e1f5e4da43ebfc3ea59e0ab05b962643c31a17eb cannot be scheduled right now. You can ignore this message if this Ray cluster is expected to auto-scale or if you specified a runtime_env for this actor or task, which may take time to install. Otherwise, this is likely due to all cluster resources being claimed by actors. To resolve the issue, consider creating fewer actors or increasing the resources available to this Ray cluster.
Required resources for this actor or task: {CPU: 1.000000}
Available resources on this node: {0.000000/2.000000 CPU, 75224640.039062 GiB/75224640.039062 GiB memory, 37612319.970703 GiB/37612319.970703 GiB object_store_memory, 1.000000/1.000000 node:169.255.254.2}
In total there are 2 pending tasks and 0 pending actors on this node.
-42.9628 = Validation score (root_mean_squared_error)
8.18s = Training runtime
0.11s = Validation runtime
Fitting model: LightGBM_BAG_L1/T6 ... Training model for up to 682.06s of the 1682.11s of remaining time.
Fitting 4 child models (S1F2 - S1F5) | Fitting with ParallelLocalFoldFittingStrategy
-108.8803 = Validation score (root_mean_squared_error)
7.85s = Training runtime
0.09s = Validation runtime
Fitting model: LightGBM_BAG_L1/T7 ... Training model for up to 671.39s of the 1671.44s of remaining time.
Fitting 4 child models (S1F2 - S1F5) | Fitting with ParallelLocalFoldFittingStrategy
-37.8155 = Validation score (root_mean_squared_error)
8.42s = Training runtime
0.11s = Validation runtime
Fitting model: LightGBM_BAG_L1/T8 ... Training model for up to 659.49s of the 1659.54s of remaining time.
Fitting 4 child models (S1F2 - S1F5) | Fitting with ParallelLocalFoldFittingStrategy
-37.0714 = Validation score (root_mean_squared_error)
8.74s = Training runtime
0.12s = Validation runtime
Fitting model: LightGBM_BAG_L1/T9 ... Training model for up to 647.35s of the 1647.4s of remaining time.
Fitting 4 child models (S1F2 - S1F5) | Fitting with ParallelLocalFoldFittingStrategy
-105.4165 = Validation score (root_mean_squared_error)
8.48s = Training runtime
0.09s = Validation runtime
Fitting model: LightGBM_BAG_L1/T10 ... Training model for up to 635.65s of the 1635.7s of remaining time.
Fitting 4 child models (S1F2 - S1F5) | Fitting with ParallelLocalFoldFittingStrategy
-35.8886 = Validation score (root_mean_squared_error)
8.18s = Training runtime
0.13s = Validation runtime
Fitting model: NeuralNetTorch_BAG_L1/T1 ... Training model for up to 625.13s of the 1625.18s of remaining time.
Fitting 4 child models (S1F2 - S1F5) | Fitting with ParallelLocalFoldFittingStrategy
-63.8614 = Validation score (root_mean_squared_error)
49.85s = Training runtime
0.3s = Validation runtime
Fitting model: NeuralNetTorch_BAG_L1/T2 ... Training model for up to 582.15s of the 1582.2s of remaining time.
Fitting 4 child models (S1F2 - S1F5) | Fitting with ParallelLocalFoldFittingStrategy
-55.2096 = Validation score (root_mean_squared_error)
85.37s = Training runtime
0.26s = Validation runtime
Fitting model: NeuralNetTorch_BAG_L1/T3 ... Training model for up to 509.79s of the 1509.84s of remaining time.
Fitting 4 child models (S1F2 - S1F5) | Fitting with ParallelLocalFoldFittingStrategy
-405.7627 = Validation score (root_mean_squared_error)
106.74s = Training runtime
0.85s = Validation runtime
Completed 1/5 k-fold bagging repeats ...
Fitting model: WeightedEnsemble_L2 ... Training model for up to 360.0s of the 1426.62s of remaining time.
-35.5483 = Validation score (root_mean_squared_error)
0.66s = Training runtime
0.0s = Validation runtime
Fitting 2 L2 models ...
Hyperparameter tuning model: LightGBM_BAG_L2 ... Tuning model for up to 85.53s of the 1425.82s of remaining time.
Fitted model: LightGBM_BAG_L2/T1 ...
-35.3987 = Validation score (root_mean_squared_error)
0.83s = Training runtime
0.01s = Validation runtime
Fitted model: LightGBM_BAG_L2/T2 ...
-36.0473 = Validation score (root_mean_squared_error)
0.76s = Training runtime
0.01s = Validation runtime
Fitted model: LightGBM_BAG_L2/T3 ...
-35.16 = Validation score (root_mean_squared_error)
0.96s = Training runtime
0.01s = Validation runtime
Fitted model: LightGBM_BAG_L2/T4 ...
-101.9243 = Validation score (root_mean_squared_error)
0.8s = Training runtime
0.01s = Validation runtime
Fitted model: LightGBM_BAG_L2/T5 ...
-35.5939 = Validation score (root_mean_squared_error)
0.85s = Training runtime
0.01s = Validation runtime
Fitted model: LightGBM_BAG_L2/T6 ...
-99.0164 = Validation score (root_mean_squared_error)
0.91s = Training runtime
0.01s = Validation runtime
Fitted model: LightGBM_BAG_L2/T7 ...
-35.4672 = Validation score (root_mean_squared_error)
0.73s = Training runtime
0.01s = Validation runtime
Fitted model: LightGBM_BAG_L2/T8 ...
-34.981 = Validation score (root_mean_squared_error)
0.92s = Training runtime
0.01s = Validation runtime
Fitted model: LightGBM_BAG_L2/T9 ...
-88.9118 = Validation score (root_mean_squared_error)
0.79s = Training runtime
0.01s = Validation runtime
Fitted model: LightGBM_BAG_L2/T10 ...
-35.5412 = Validation score (root_mean_squared_error)
0.78s = Training runtime
0.01s = Validation runtime
Hyperparameter tuning model: NeuralNetTorch_BAG_L2 ... Tuning model for up to 85.53s of the 1415.21s of remaining time.
Ran out of time, stopping training early. (Stopping on epoch 29)
Stopping HPO to satisfy time limit...
Fitted model: NeuralNetTorch_BAG_L2/T1 ...
-35.6624 = Validation score (root_mean_squared_error)
13.44s = Training runtime
0.06s = Validation runtime
Fitted model: NeuralNetTorch_BAG_L2/T2 ...
-36.4908 = Validation score (root_mean_squared_error)
21.96s = Training runtime
0.06s = Validation runtime
Fitted model: NeuralNetTorch_BAG_L2/T3 ...
-309.6203 = Validation score (root_mean_squared_error)
31.98s = Training runtime
0.18s = Validation runtime
Fitting model: LightGBM_BAG_L2/T1 ... Training model for up to 870.73s of the 1346.22s of remaining time.
Fitting 4 child models (S1F2 - S1F5) | Fitting with ParallelLocalFoldFittingStrategy
-35.3996 = Validation score (root_mean_squared_error)
9.72s = Training runtime
0.12s = Validation runtime
Fitting model: LightGBM_BAG_L2/T2 ... Training model for up to 857.97s of the 1333.46s of remaining time.
Fitting 4 child models (S1F2 - S1F5) | Fitting with ParallelLocalFoldFittingStrategy
-35.7387 = Validation score (root_mean_squared_error)
10.04s = Training runtime
0.1s = Validation runtime
Fitting model: LightGBM_BAG_L2/T3 ... Training model for up to 844.11s of the 1319.6s of remaining time.
Fitting 4 child models (S1F2 - S1F5) | Fitting with ParallelLocalFoldFittingStrategy
-35.3795 = Validation score (root_mean_squared_error)
9.48s = Training runtime
0.12s = Validation runtime
Fitting model: LightGBM_BAG_L2/T4 ... Training model for up to 832.24s of the 1307.72s of remaining time.
Fitting 4 child models (S1F2 - S1F5) | Fitting with ParallelLocalFoldFittingStrategy
-102.2379 = Validation score (root_mean_squared_error)
9.53s = Training runtime
0.1s = Validation runtime
Fitting model: LightGBM_BAG_L2/T5 ... Training model for up to 819.25s of the 1294.74s of remaining time.
Fitting 4 child models (S1F2 - S1F5) | Fitting with ParallelLocalFoldFittingStrategy
-35.7019 = Validation score (root_mean_squared_error)
8.78s = Training runtime
0.1s = Validation runtime
Fitting model: LightGBM_BAG_L2/T6 ... Training model for up to 807.98s of the 1283.47s of remaining time.
Fitting 4 child models (S1F2 - S1F5) | Fitting with ParallelLocalFoldFittingStrategy
-99.2001 = Validation score (root_mean_squared_error)
9.18s = Training runtime
0.1s = Validation runtime
Fitting model: LightGBM_BAG_L2/T7 ... Training model for up to 796.42s of the 1271.91s of remaining time.
Fitting 4 child models (S1F2 - S1F5) | Fitting with ParallelLocalFoldFittingStrategy
-35.4836 = Validation score (root_mean_squared_error)
8.44s = Training runtime
0.08s = Validation runtime
Fitting model: LightGBM_BAG_L2/T8 ... Training model for up to 785.76s of the 1261.24s of remaining time.
Fitting 4 child models (S1F2 - S1F5) | Fitting with ParallelLocalFoldFittingStrategy
-35.3704 = Validation score (root_mean_squared_error)
10.02s = Training runtime
0.08s = Validation runtime
Fitting model: LightGBM_BAG_L2/T9 ... Training model for up to 773.37s of the 1248.85s of remaining time.
Fitting 4 child models (S1F2 - S1F5) | Fitting with ParallelLocalFoldFittingStrategy
-89.2787 = Validation score (root_mean_squared_error)
8.97s = Training runtime
0.11s = Validation runtime
Fitting model: LightGBM_BAG_L2/T10 ... Training model for up to 761.02s of the 1236.51s of remaining time.
Fitting 4 child models (S1F2 - S1F5) | Fitting with ParallelLocalFoldFittingStrategy
-35.4791 = Validation score (root_mean_squared_error)
9.45s = Training runtime
0.05s = Validation runtime
Fitting model: NeuralNetTorch_BAG_L2/T1 ... Training model for up to 748.8s of the 1224.29s of remaining time.
Fitting 4 child models (S1F2 - S1F5) | Fitting with ParallelLocalFoldFittingStrategy
-36.0893 = Validation score (root_mean_squared_error)
53.82s = Training runtime
0.41s = Validation runtime
Fitting model: NeuralNetTorch_BAG_L2/T2 ... Training model for up to 705.13s of the 1180.62s of remaining time.
Fitting 4 child models (S1F2 - S1F5) | Fitting with ParallelLocalFoldFittingStrategy
-36.6699 = Validation score (root_mean_squared_error)
88.48s = Training runtime
0.31s = Validation runtime
Fitting model: NeuralNetTorch_BAG_L2/T3 ... Training model for up to 634.65s of the 1110.14s of remaining time.
Fitting 4 child models (S1F2 - S1F5) | Fitting with ParallelLocalFoldFittingStrategy
-328.4201 = Validation score (root_mean_squared_error)
135.68s = Training runtime
1.25s = Validation runtime
Completed 1/5 k-fold bagging repeats ...
Fitting model: WeightedEnsemble_L3 ... Training model for up to 360.0s of the 1002.46s of remaining time.
-34.813 = Validation score (root_mean_squared_error)
0.67s = Training runtime
0.0s = Validation runtime
Fitting 2 L3 models ...
Hyperparameter tuning model: LightGBM_BAG_L3 ... Tuning model for up to 90.15s of the 1001.66s of remaining time.
Fitted model: LightGBM_BAG_L3/T1 ...
-35.3391 = Validation score (root_mean_squared_error)
0.8s = Training runtime
0.01s = Validation runtime
Fitted model: LightGBM_BAG_L3/T2 ...
-35.1698 = Validation score (root_mean_squared_error)
0.76s = Training runtime
0.01s = Validation runtime
Fitted model: LightGBM_BAG_L3/T3 ...
-35.5969 = Validation score (root_mean_squared_error)
0.93s = Training runtime
0.02s = Validation runtime
Fitted model: LightGBM_BAG_L3/T4 ...
-103.6701 = Validation score (root_mean_squared_error)
0.78s = Training runtime
0.01s = Validation runtime
Fitted model: LightGBM_BAG_L3/T5 ...
-35.5011 = Validation score (root_mean_squared_error)
0.85s = Training runtime
0.01s = Validation runtime
Fitted model: LightGBM_BAG_L3/T6 ...
-100.4742 = Validation score (root_mean_squared_error)
0.9s = Training runtime
0.01s = Validation runtime
Fitted model: LightGBM_BAG_L3/T7 ...
-35.1441 = Validation score (root_mean_squared_error)
0.71s = Training runtime
0.01s = Validation runtime
Fitted model: LightGBM_BAG_L3/T8 ...
-35.5866 = Validation score (root_mean_squared_error)
0.91s = Training runtime
0.01s = Validation runtime
Fitted model: LightGBM_BAG_L3/T9 ...
-90.3702 = Validation score (root_mean_squared_error)
0.76s = Training runtime
0.01s = Validation runtime
Fitted model: LightGBM_BAG_L3/T10 ...
-35.4085 = Validation score (root_mean_squared_error)
0.76s = Training runtime
0.01s = Validation runtime
Hyperparameter tuning model: NeuralNetTorch_BAG_L3 ... Tuning model for up to 90.15s of the 991.27s of remaining time.
Ran out of time, stopping training early. (Stopping on epoch 18)
Stopping HPO to satisfy time limit...
Fitted model: NeuralNetTorch_BAG_L3/T1 ...
-36.1775 = Validation score (root_mean_squared_error)
9.38s = Training runtime
0.05s = Validation runtime
Fitted model: NeuralNetTorch_BAG_L3/T2 ...
-36.4663 = Validation score (root_mean_squared_error)
17.18s = Training runtime
0.04s = Validation runtime
Fitted model: NeuralNetTorch_BAG_L3/T3 ...
-284.696 = Validation score (root_mean_squared_error)
26.81s = Training runtime
0.24s = Validation runtime
Fitted model: NeuralNetTorch_BAG_L3/T4 ...
-37.1584 = Validation score (root_mean_squared_error)
18.02s = Training runtime
0.07s = Validation runtime
Fitting model: LightGBM_BAG_L3/T1 ... Training model for up to 917.73s of the 917.7s of remaining time.
Fitting 4 child models (S1F2 - S1F5) | Fitting with ParallelLocalFoldFittingStrategy
-35.6874 = Validation score (root_mean_squared_error)
9.04s = Training runtime
0.12s = Validation runtime
Fitting model: LightGBM_BAG_L3/T2 ... Training model for up to 905.78s of the 905.74s of remaining time.
Fitting 4 child models (S1F2 - S1F5) | Fitting with ParallelLocalFoldFittingStrategy
-35.4363 = Validation score (root_mean_squared_error)
8.88s = Training runtime
0.09s = Validation runtime
Fitting model: LightGBM_BAG_L3/T3 ... Training model for up to 893.41s of the 893.37s of remaining time.
Fitting 4 child models (S1F2 - S1F5) | Fitting with ParallelLocalFoldFittingStrategy
-35.9269 = Validation score (root_mean_squared_error)
9.0s = Training runtime
0.11s = Validation runtime
Fitting model: LightGBM_BAG_L3/T4 ... Training model for up to 882.38s of the 882.32s of remaining time.
Fitting 4 child models (S1F2 - S1F5) | Fitting with ParallelLocalFoldFittingStrategy
-102.007 = Validation score (root_mean_squared_error)
8.31s = Training runtime
0.09s = Validation runtime
Fitting model: LightGBM_BAG_L3/T5 ... Training model for up to 871.23s of the 871.19s of remaining time.
Fitting 4 child models (S1F2 - S1F5) | Fitting with ParallelLocalFoldFittingStrategy
-35.9667 = Validation score (root_mean_squared_error)
9.15s = Training runtime
0.1s = Validation runtime
Fitting model: LightGBM_BAG_L3/T6 ... Training model for up to 858.63s of the 858.59s of remaining time.
Fitting 4 child models (S1F2 - S1F5) | Fitting with ParallelLocalFoldFittingStrategy
-98.9686 = Validation score (root_mean_squared_error)
9.24s = Training runtime
0.12s = Validation runtime
Fitting model: LightGBM_BAG_L3/T7 ... Training model for up to 845.87s of the 845.84s of remaining time.
Fitting 4 child models (S1F2 - S1F5) | Fitting with ParallelLocalFoldFittingStrategy
-35.4827 = Validation score (root_mean_squared_error)
8.28s = Training runtime
0.07s = Validation runtime
Fitting model: LightGBM_BAG_L3/T8 ... Training model for up to 835.25s of the 835.22s of remaining time.
Fitting 4 child models (S1F2 - S1F5) | Fitting with ParallelLocalFoldFittingStrategy
-35.9222 = Validation score (root_mean_squared_error)
9.5s = Training runtime
0.06s = Validation runtime
Fitting model: LightGBM_BAG_L3/T9 ... Training model for up to 822.22s of the 822.19s of remaining time.
Fitting 4 child models (S1F2 - S1F5) | Fitting with ParallelLocalFoldFittingStrategy
-89.0275 = Validation score (root_mean_squared_error)
9.19s = Training runtime
0.09s = Validation runtime
Fitting model: LightGBM_BAG_L3/T10 ... Training model for up to 809.4s of the 809.37s of remaining time.
Fitting 4 child models (S1F2 - S1F5) | Fitting with ParallelLocalFoldFittingStrategy
-35.8028 = Validation score (root_mean_squared_error)
8.72s = Training runtime
0.07s = Validation runtime
Fitting model: NeuralNetTorch_BAG_L3/T1 ... Training model for up to 798.41s of the 798.38s of remaining time.
Fitting 4 child models (S1F2 - S1F5) | Fitting with ParallelLocalFoldFittingStrategy
-36.2485 = Validation score (root_mean_squared_error)
50.67s = Training runtime
0.43s = Validation runtime
Fitting model: NeuralNetTorch_BAG_L3/T2 ... Training model for up to 753.01s of the 752.97s of remaining time.
Fitting 4 child models (S1F2 - S1F5) | Fitting with ParallelLocalFoldFittingStrategy
-37.6199 = Validation score (root_mean_squared_error)
86.69s = Training runtime
0.28s = Validation runtime
Fitting model: NeuralNetTorch_BAG_L3/T3 ... Training model for up to 679.23s of the 679.19s of remaining time.
Fitting 4 child models (S1F2 - S1F5) | Fitting with ParallelLocalFoldFittingStrategy
-417.2257 = Validation score (root_mean_squared_error)
104.7s = Training runtime
1.0s = Validation runtime
Fitting model: NeuralNetTorch_BAG_L3/T4 ... Training model for up to 597.86s of the 597.83s of remaining time.
Fitting 4 child models (S1F2 - S1F5) | Fitting with ParallelLocalFoldFittingStrategy
-37.121 = Validation score (root_mean_squared_error)
132.61s = Training runtime
0.61s = Validation runtime
Completed 1/5 k-fold bagging repeats ...
Fitting model: WeightedEnsemble_L4 ... Training model for up to 360.0s of the 479.8s of remaining time.
-35.1113 = Validation score (root_mean_squared_error)
0.7s = Training runtime
0.0s = Validation runtime
AutoGluon training complete, total runtime = 1321.17s ... Best model: "WeightedEnsemble_L3"
TabularPredictor saved. To load, use: predictor = TabularPredictor.load("AutogluonModels/ag-20220317_191204/")
predictor_new_hpo.fit_summary()
*** Summary of fit() ***
Estimated performance of each model:
model score_val pred_time_val fit_time pred_time_val_marginal fit_time_marginal stack_level can_infer fit_order
0 WeightedEnsemble_L3 -34.813010 3.672309 513.167733 0.001020 0.665498 3 True 28
1 WeightedEnsemble_L4 -35.111313 6.914216 982.520030 0.000695 0.704736 4 True 43
2 LightGBM_BAG_L2/T8 -35.370413 2.576208 333.106889 0.082673 10.020209 2 True 22
3 LightGBM_BAG_L2/T3 -35.379546 2.616498 332.562367 0.122963 9.475686 2 True 17
4 LightGBM_BAG_L2/T1 -35.399568 2.610327 332.805831 0.116792 9.719151 2 True 15
5 LightGBM_BAG_L3/T2 -35.436314 5.517697 703.559947 0.088392 8.884714 3 True 30
6 LightGBM_BAG_L2/T10 -35.479069 2.546378 332.541213 0.052843 9.454533 2 True 24
7 LightGBM_BAG_L3/T7 -35.482654 5.498135 702.956028 0.068830 8.280795 3 True 35
8 LightGBM_BAG_L2/T7 -35.483605 2.573981 331.529147 0.080446 8.442467 2 True 21
9 WeightedEnsemble_L2 -35.548268 0.510324 102.946178 0.000756 0.661998 2 True 14
10 LightGBM_BAG_L3/T1 -35.687376 5.546787 703.716944 0.117481 9.041712 3 True 29
11 LightGBM_BAG_L2/T5 -35.701899 2.595537 331.862314 0.102002 8.775634 2 True 19
12 LightGBM_BAG_L2/T2 -35.738677 2.591325 333.131061 0.097790 10.044380 2 True 16
13 LightGBM_BAG_L3/T10 -35.802838 5.501674 703.391163 0.072368 8.715930 3 True 38
14 LightGBM_BAG_L1/T10 -35.888562 0.125862 8.176670 0.125862 8.176670 1 True 10
15 LightGBM_BAG_L3/T8 -35.922184 5.494029 704.176745 0.064723 9.501513 3 True 36
16 LightGBM_BAG_L3/T3 -35.926907 5.540156 703.672698 0.110851 8.997465 3 True 31
17 LightGBM_BAG_L3/T5 -35.966745 5.527014 703.825497 0.097709 9.150265 3 True 33
18 NeuralNetTorch_BAG_L2/T1 -36.089283 2.905886 376.906223 0.412351 53.819543 2 True 25
19 NeuralNetTorch_BAG_L3/T1 -36.248508 5.863663 745.350188 0.434358 50.674956 3 True 39
20 NeuralNetTorch_BAG_L2/T2 -36.669852 2.803221 411.570647 0.309685 88.483967 2 True 26
21 LightGBM_BAG_L1/T8 -37.071398 0.120370 8.737298 0.120370 8.737298 1 True 8
22 NeuralNetTorch_BAG_L3/T4 -37.120968 6.037293 827.288448 0.607988 132.613215 3 True 42
23 NeuralNetTorch_BAG_L3/T2 -37.619868 5.713954 781.361614 0.284649 86.686382 3 True 40
24 LightGBM_BAG_L1/T7 -37.815541 0.113737 8.417716 0.113737 8.417716 1 True 7
25 LightGBM_BAG_L1/T2 -38.660252 0.108927 7.505810 0.108927 7.505810 1 True 2
26 LightGBM_BAG_L1/T3 -39.344382 0.131958 7.882296 0.131958 7.882296 1 True 3
27 LightGBM_BAG_L1/T1 -39.956017 0.107116 7.668992 0.107116 7.668992 1 True 1
28 LightGBM_BAG_L1/T5 -42.962833 0.105097 8.176471 0.105097 8.176471 1 True 5
29 NeuralNetTorch_BAG_L1/T2 -55.209600 0.263336 85.370211 0.263336 85.370211 1 True 12
30 NeuralNetTorch_BAG_L1/T1 -63.861354 0.298772 49.849620 0.298772 49.849620 1 True 11
31 LightGBM_BAG_L3/T9 -89.027523 5.516489 703.865573 0.087184 9.190340 3 True 37
32 LightGBM_BAG_L2/T9 -89.278716 2.600882 332.053745 0.107347 8.967065 2 True 23
33 LightGBM_BAG_L3/T6 -98.968568 5.549705 703.914863 0.120400 9.239630 3 True 34
34 LightGBM_BAG_L2/T6 -99.200054 2.593319 332.261962 0.099784 9.175281 2 True 20
35 LightGBM_BAG_L3/T4 -102.006964 5.519915 702.987590 0.090610 8.312357 3 True 32
36 LightGBM_BAG_L2/T4 -102.237873 2.589868 332.614861 0.096333 9.528181 2 True 18
37 LightGBM_BAG_L1/T9 -105.416537 0.085729 8.484714 0.085729 8.484714 1 True 9
38 LightGBM_BAG_L1/T6 -108.880311 0.087695 7.851140 0.087695 7.851140 1 True 6
39 LightGBM_BAG_L1/T4 -117.026282 0.094454 8.223054 0.094454 8.223054 1 True 4
40 NeuralNetTorch_BAG_L2/T3 -328.420101 3.748296 458.769137 1.254761 135.682457 2 True 27
41 NeuralNetTorch_BAG_L1/T3 -405.762688 0.850481 106.742689 0.850481 106.742689 1 True 13
42 NeuralNetTorch_BAG_L3/T3 -417.225687 6.426141 799.380106 0.996836 104.704874 3 True 41
Number of models trained: 43
Types of models trained:
{'WeightedEnsembleModel', 'StackerEnsembleModel_TabularNeuralNetTorch', 'StackerEnsembleModel_LGB'}
Bagging used: True (with 5 folds)
Multi-layer stack-ensembling used: True (with 4 levels)
Feature Metadata (Processed):
(raw dtype, special dtypes):
('category', []) : 3 | ['season', 'weather', 'demand']
('float', []) : 3 | ['temp', 'atemp', 'windspeed']
('int', []) : 3 | ['humidity', 'month', 'hour']
('int', ['bool']) : 2 | ['holiday', 'workingday']
('int', ['datetime_as_int']) : 5 | ['datetime', 'datetime.year', 'datetime.month', 'datetime.day', 'datetime.dayofweek']
Plot summary of models saved to file: AutogluonModels/ag-20220317_191204/SummaryOfModels.html
*** End of fit() summary ***
{'model_types': {'LightGBM_BAG_L1/T1': 'StackerEnsembleModel_LGB',
'LightGBM_BAG_L1/T2': 'StackerEnsembleModel_LGB',
'LightGBM_BAG_L1/T3': 'StackerEnsembleModel_LGB',
'LightGBM_BAG_L1/T4': 'StackerEnsembleModel_LGB',
'LightGBM_BAG_L1/T5': 'StackerEnsembleModel_LGB',
'LightGBM_BAG_L1/T6': 'StackerEnsembleModel_LGB',
'LightGBM_BAG_L1/T7': 'StackerEnsembleModel_LGB',
'LightGBM_BAG_L1/T8': 'StackerEnsembleModel_LGB',
'LightGBM_BAG_L1/T9': 'StackerEnsembleModel_LGB',
'LightGBM_BAG_L1/T10': 'StackerEnsembleModel_LGB',
'NeuralNetTorch_BAG_L1/T1': 'StackerEnsembleModel_TabularNeuralNetTorch',
'NeuralNetTorch_BAG_L1/T2': 'StackerEnsembleModel_TabularNeuralNetTorch',
'NeuralNetTorch_BAG_L1/T3': 'StackerEnsembleModel_TabularNeuralNetTorch',
'WeightedEnsemble_L2': 'WeightedEnsembleModel',
'LightGBM_BAG_L2/T1': 'StackerEnsembleModel_LGB',
'LightGBM_BAG_L2/T2': 'StackerEnsembleModel_LGB',
'LightGBM_BAG_L2/T3': 'StackerEnsembleModel_LGB',
'LightGBM_BAG_L2/T4': 'StackerEnsembleModel_LGB',
'LightGBM_BAG_L2/T5': 'StackerEnsembleModel_LGB',
'LightGBM_BAG_L2/T6': 'StackerEnsembleModel_LGB',
'LightGBM_BAG_L2/T7': 'StackerEnsembleModel_LGB',
'LightGBM_BAG_L2/T8': 'StackerEnsembleModel_LGB',
'LightGBM_BAG_L2/T9': 'StackerEnsembleModel_LGB',
'LightGBM_BAG_L2/T10': 'StackerEnsembleModel_LGB',
'NeuralNetTorch_BAG_L2/T1': 'StackerEnsembleModel_TabularNeuralNetTorch',
'NeuralNetTorch_BAG_L2/T2': 'StackerEnsembleModel_TabularNeuralNetTorch',
'NeuralNetTorch_BAG_L2/T3': 'StackerEnsembleModel_TabularNeuralNetTorch',
'WeightedEnsemble_L3': 'WeightedEnsembleModel',
'LightGBM_BAG_L3/T1': 'StackerEnsembleModel_LGB',
'LightGBM_BAG_L3/T2': 'StackerEnsembleModel_LGB',
'LightGBM_BAG_L3/T3': 'StackerEnsembleModel_LGB',
'LightGBM_BAG_L3/T4': 'StackerEnsembleModel_LGB',
'LightGBM_BAG_L3/T5': 'StackerEnsembleModel_LGB',
'LightGBM_BAG_L3/T6': 'StackerEnsembleModel_LGB',
'LightGBM_BAG_L3/T7': 'StackerEnsembleModel_LGB',
'LightGBM_BAG_L3/T8': 'StackerEnsembleModel_LGB',
'LightGBM_BAG_L3/T9': 'StackerEnsembleModel_LGB',
'LightGBM_BAG_L3/T10': 'StackerEnsembleModel_LGB',
'NeuralNetTorch_BAG_L3/T1': 'StackerEnsembleModel_TabularNeuralNetTorch',
'NeuralNetTorch_BAG_L3/T2': 'StackerEnsembleModel_TabularNeuralNetTorch',
'NeuralNetTorch_BAG_L3/T3': 'StackerEnsembleModel_TabularNeuralNetTorch',
'NeuralNetTorch_BAG_L3/T4': 'StackerEnsembleModel_TabularNeuralNetTorch',
'WeightedEnsemble_L4': 'WeightedEnsembleModel'},
'model_performance': {'LightGBM_BAG_L1/T1': -39.956016549966215,
'LightGBM_BAG_L1/T2': -38.66025231525806,
'LightGBM_BAG_L1/T3': -39.34438237824181,
'LightGBM_BAG_L1/T4': -117.02628155660842,
'LightGBM_BAG_L1/T5': -42.96283331635281,
'LightGBM_BAG_L1/T6': -108.88031056269212,
'LightGBM_BAG_L1/T7': -37.81554087250205,
'LightGBM_BAG_L1/T8': -37.07139813760213,
'LightGBM_BAG_L1/T9': -105.41653654070552,
'LightGBM_BAG_L1/T10': -35.88856244655827,
'NeuralNetTorch_BAG_L1/T1': -63.861353735123814,
'NeuralNetTorch_BAG_L1/T2': -55.209599656207914,
'NeuralNetTorch_BAG_L1/T3': -405.7626876597812,
'WeightedEnsemble_L2': -35.548267987794915,
'LightGBM_BAG_L2/T1': -35.39956772531782,
'LightGBM_BAG_L2/T2': -35.738677026894536,
'LightGBM_BAG_L2/T3': -35.37954621805104,
'LightGBM_BAG_L2/T4': -102.23787256035632,
'LightGBM_BAG_L2/T5': -35.7018994414588,
'LightGBM_BAG_L2/T6': -99.20005442813401,
'LightGBM_BAG_L2/T7': -35.483605304948604,
'LightGBM_BAG_L2/T8': -35.3704128514316,
'LightGBM_BAG_L2/T9': -89.27871604942428,
'LightGBM_BAG_L2/T10': -35.47906938755496,
'NeuralNetTorch_BAG_L2/T1': -36.08928270382084,
'NeuralNetTorch_BAG_L2/T2': -36.66985226852717,
'NeuralNetTorch_BAG_L2/T3': -328.4201008905082,
'WeightedEnsemble_L3': -34.81300990449738,
'LightGBM_BAG_L3/T1': -35.687376294330726,
'LightGBM_BAG_L3/T2': -35.43631366346189,
'LightGBM_BAG_L3/T3': -35.92690731081036,
'LightGBM_BAG_L3/T4': -102.00696413506348,
'LightGBM_BAG_L3/T5': -35.96674456603231,
'LightGBM_BAG_L3/T6': -98.96856819229102,
'LightGBM_BAG_L3/T7': -35.48265413314625,
'LightGBM_BAG_L3/T8': -35.922183617877444,
'LightGBM_BAG_L3/T9': -89.02752254500437,
'LightGBM_BAG_L3/T10': -35.802838270420175,
'NeuralNetTorch_BAG_L3/T1': -36.24850808894937,
'NeuralNetTorch_BAG_L3/T2': -37.61986783218606,
'NeuralNetTorch_BAG_L3/T3': -417.2256869636312,
'NeuralNetTorch_BAG_L3/T4': -37.12096773095651,
'WeightedEnsemble_L4': -35.111313282432405},
'model_best': 'WeightedEnsemble_L3',
'model_paths': {'LightGBM_BAG_L1/T1': 'AutogluonModels/ag-20220317_191204/models/LightGBM_BAG_L1/T1/',
'LightGBM_BAG_L1/T2': 'AutogluonModels/ag-20220317_191204/models/LightGBM_BAG_L1/T2/',
'LightGBM_BAG_L1/T3': 'AutogluonModels/ag-20220317_191204/models/LightGBM_BAG_L1/T3/',
'LightGBM_BAG_L1/T4': 'AutogluonModels/ag-20220317_191204/models/LightGBM_BAG_L1/T4/',
'LightGBM_BAG_L1/T5': 'AutogluonModels/ag-20220317_191204/models/LightGBM_BAG_L1/T5/',
'LightGBM_BAG_L1/T6': 'AutogluonModels/ag-20220317_191204/models/LightGBM_BAG_L1/T6/',
'LightGBM_BAG_L1/T7': 'AutogluonModels/ag-20220317_191204/models/LightGBM_BAG_L1/T7/',
'LightGBM_BAG_L1/T8': 'AutogluonModels/ag-20220317_191204/models/LightGBM_BAG_L1/T8/',
'LightGBM_BAG_L1/T9': 'AutogluonModels/ag-20220317_191204/models/LightGBM_BAG_L1/T9/',
'LightGBM_BAG_L1/T10': 'AutogluonModels/ag-20220317_191204/models/LightGBM_BAG_L1/T10/',
'NeuralNetTorch_BAG_L1/T1': 'AutogluonModels/ag-20220317_191204/models/NeuralNetTorch_BAG_L1/T1/',
'NeuralNetTorch_BAG_L1/T2': 'AutogluonModels/ag-20220317_191204/models/NeuralNetTorch_BAG_L1/T2/',
'NeuralNetTorch_BAG_L1/T3': 'AutogluonModels/ag-20220317_191204/models/NeuralNetTorch_BAG_L1/T3/',
'WeightedEnsemble_L2': 'AutogluonModels/ag-20220317_191204/models/WeightedEnsemble_L2/',
'LightGBM_BAG_L2/T1': 'AutogluonModels/ag-20220317_191204/models/LightGBM_BAG_L2/T1/',
'LightGBM_BAG_L2/T2': 'AutogluonModels/ag-20220317_191204/models/LightGBM_BAG_L2/T2/',
'LightGBM_BAG_L2/T3': 'AutogluonModels/ag-20220317_191204/models/LightGBM_BAG_L2/T3/',
'LightGBM_BAG_L2/T4': 'AutogluonModels/ag-20220317_191204/models/LightGBM_BAG_L2/T4/',
'LightGBM_BAG_L2/T5': 'AutogluonModels/ag-20220317_191204/models/LightGBM_BAG_L2/T5/',
'LightGBM_BAG_L2/T6': 'AutogluonModels/ag-20220317_191204/models/LightGBM_BAG_L2/T6/',
'LightGBM_BAG_L2/T7': 'AutogluonModels/ag-20220317_191204/models/LightGBM_BAG_L2/T7/',
'LightGBM_BAG_L2/T8': 'AutogluonModels/ag-20220317_191204/models/LightGBM_BAG_L2/T8/',
'LightGBM_BAG_L2/T9': 'AutogluonModels/ag-20220317_191204/models/LightGBM_BAG_L2/T9/',
'LightGBM_BAG_L2/T10': 'AutogluonModels/ag-20220317_191204/models/LightGBM_BAG_L2/T10/',
'NeuralNetTorch_BAG_L2/T1': 'AutogluonModels/ag-20220317_191204/models/NeuralNetTorch_BAG_L2/T1/',
'NeuralNetTorch_BAG_L2/T2': 'AutogluonModels/ag-20220317_191204/models/NeuralNetTorch_BAG_L2/T2/',
'NeuralNetTorch_BAG_L2/T3': 'AutogluonModels/ag-20220317_191204/models/NeuralNetTorch_BAG_L2/T3/',
'WeightedEnsemble_L3': 'AutogluonModels/ag-20220317_191204/models/WeightedEnsemble_L3/',
'LightGBM_BAG_L3/T1': 'AutogluonModels/ag-20220317_191204/models/LightGBM_BAG_L3/T1/',
'LightGBM_BAG_L3/T2': 'AutogluonModels/ag-20220317_191204/models/LightGBM_BAG_L3/T2/',
'LightGBM_BAG_L3/T3': 'AutogluonModels/ag-20220317_191204/models/LightGBM_BAG_L3/T3/',
'LightGBM_BAG_L3/T4': 'AutogluonModels/ag-20220317_191204/models/LightGBM_BAG_L3/T4/',
'LightGBM_BAG_L3/T5': 'AutogluonModels/ag-20220317_191204/models/LightGBM_BAG_L3/T5/',
'LightGBM_BAG_L3/T6': 'AutogluonModels/ag-20220317_191204/models/LightGBM_BAG_L3/T6/',
'LightGBM_BAG_L3/T7': 'AutogluonModels/ag-20220317_191204/models/LightGBM_BAG_L3/T7/',
'LightGBM_BAG_L3/T8': 'AutogluonModels/ag-20220317_191204/models/LightGBM_BAG_L3/T8/',
'LightGBM_BAG_L3/T9': 'AutogluonModels/ag-20220317_191204/models/LightGBM_BAG_L3/T9/',
'LightGBM_BAG_L3/T10': 'AutogluonModels/ag-20220317_191204/models/LightGBM_BAG_L3/T10/',
'NeuralNetTorch_BAG_L3/T1': 'AutogluonModels/ag-20220317_191204/models/NeuralNetTorch_BAG_L3/T1/',
'NeuralNetTorch_BAG_L3/T2': 'AutogluonModels/ag-20220317_191204/models/NeuralNetTorch_BAG_L3/T2/',
'NeuralNetTorch_BAG_L3/T3': 'AutogluonModels/ag-20220317_191204/models/NeuralNetTorch_BAG_L3/T3/',
'NeuralNetTorch_BAG_L3/T4': 'AutogluonModels/ag-20220317_191204/models/NeuralNetTorch_BAG_L3/T4/',
'WeightedEnsemble_L4': 'AutogluonModels/ag-20220317_191204/models/WeightedEnsemble_L4/'},
'model_fit_times': {'LightGBM_BAG_L1/T1': 7.668991804122925,
'LightGBM_BAG_L1/T2': 7.505809783935547,
'LightGBM_BAG_L1/T3': 7.882295608520508,
'LightGBM_BAG_L1/T4': 8.223053932189941,
'LightGBM_BAG_L1/T5': 8.176471471786499,
'LightGBM_BAG_L1/T6': 7.851139545440674,
'LightGBM_BAG_L1/T7': 8.417715549468994,
'LightGBM_BAG_L1/T8': 8.737298488616943,
'LightGBM_BAG_L1/T9': 8.484714031219482,
'LightGBM_BAG_L1/T10': 8.17667007446289,
'NeuralNetTorch_BAG_L1/T1': 49.84962034225464,
'NeuralNetTorch_BAG_L1/T2': 85.37021088600159,
'NeuralNetTorch_BAG_L1/T3': 106.74268889427185,
'WeightedEnsemble_L2': 0.6619982719421387,
'LightGBM_BAG_L2/T1': 9.71915054321289,
'LightGBM_BAG_L2/T2': 10.044380187988281,
'LightGBM_BAG_L2/T3': 9.475686311721802,
'LightGBM_BAG_L2/T4': 9.528181076049805,
'LightGBM_BAG_L2/T5': 8.775633811950684,
'LightGBM_BAG_L2/T6': 9.175281286239624,
'LightGBM_BAG_L2/T7': 8.442466974258423,
'LightGBM_BAG_L2/T8': 10.020208597183228,
'LightGBM_BAG_L2/T9': 8.967064619064331,
'LightGBM_BAG_L2/T10': 9.454532623291016,
'NeuralNetTorch_BAG_L2/T1': 53.81954264640808,
'NeuralNetTorch_BAG_L2/T2': 88.483966588974,
'NeuralNetTorch_BAG_L2/T3': 135.68245697021484,
'WeightedEnsemble_L3': 0.6654980182647705,
'LightGBM_BAG_L3/T1': 9.041711568832397,
'LightGBM_BAG_L3/T2': 8.884714126586914,
'LightGBM_BAG_L3/T3': 8.997465372085571,
'LightGBM_BAG_L3/T4': 8.312356948852539,
'LightGBM_BAG_L3/T5': 9.150264739990234,
'LightGBM_BAG_L3/T6': 9.239630460739136,
'LightGBM_BAG_L3/T7': 8.280794858932495,
'LightGBM_BAG_L3/T8': 9.50151252746582,
'LightGBM_BAG_L3/T9': 9.190340280532837,
'LightGBM_BAG_L3/T10': 8.715929985046387,
'NeuralNetTorch_BAG_L3/T1': 50.67495560646057,
'NeuralNetTorch_BAG_L3/T2': 86.68638157844543,
'NeuralNetTorch_BAG_L3/T3': 104.70487380027771,
'NeuralNetTorch_BAG_L3/T4': 132.613214969635,
'WeightedEnsemble_L4': 0.7047362327575684},
'model_pred_times': {'LightGBM_BAG_L1/T1': 0.1071162223815918,
'LightGBM_BAG_L1/T2': 0.10892724990844727,
'LightGBM_BAG_L1/T3': 0.1319584846496582,
'LightGBM_BAG_L1/T4': 0.09445405006408691,
'LightGBM_BAG_L1/T5': 0.10509729385375977,
'LightGBM_BAG_L1/T6': 0.08769512176513672,
'LightGBM_BAG_L1/T7': 0.11373734474182129,
'LightGBM_BAG_L1/T8': 0.12036967277526855,
'LightGBM_BAG_L1/T9': 0.08572888374328613,
'LightGBM_BAG_L1/T10': 0.12586212158203125,
'NeuralNetTorch_BAG_L1/T1': 0.29877161979675293,
'NeuralNetTorch_BAG_L1/T2': 0.2633357048034668,
'NeuralNetTorch_BAG_L1/T3': 0.8504812717437744,
'WeightedEnsemble_L2': 0.0007562637329101562,
'LightGBM_BAG_L2/T1': 0.11679196357727051,
'LightGBM_BAG_L2/T2': 0.09779000282287598,
'LightGBM_BAG_L2/T3': 0.12296319007873535,
'LightGBM_BAG_L2/T4': 0.09633302688598633,
'LightGBM_BAG_L2/T5': 0.10200190544128418,
'LightGBM_BAG_L2/T6': 0.09978389739990234,
'LightGBM_BAG_L2/T7': 0.08044576644897461,
'LightGBM_BAG_L2/T8': 0.08267331123352051,
'LightGBM_BAG_L2/T9': 0.10734701156616211,
'LightGBM_BAG_L2/T10': 0.05284285545349121,
'NeuralNetTorch_BAG_L2/T1': 0.412351131439209,
'NeuralNetTorch_BAG_L2/T2': 0.30968546867370605,
'NeuralNetTorch_BAG_L2/T3': 1.2547607421875,
'WeightedEnsemble_L3': 0.0010199546813964844,
'LightGBM_BAG_L3/T1': 0.11748147010803223,
'LightGBM_BAG_L3/T2': 0.08839154243469238,
'LightGBM_BAG_L3/T3': 0.11085057258605957,
'LightGBM_BAG_L3/T4': 0.09061002731323242,
'LightGBM_BAG_L3/T5': 0.09770870208740234,
'LightGBM_BAG_L3/T6': 0.12039971351623535,
'LightGBM_BAG_L3/T7': 0.06883001327514648,
'LightGBM_BAG_L3/T8': 0.06472325325012207,
'LightGBM_BAG_L3/T9': 0.08718371391296387,
'LightGBM_BAG_L3/T10': 0.07236838340759277,
'NeuralNetTorch_BAG_L3/T1': 0.4343576431274414,
'NeuralNetTorch_BAG_L3/T2': 0.2846488952636719,
'NeuralNetTorch_BAG_L3/T3': 0.9968357086181641,
'NeuralNetTorch_BAG_L3/T4': 0.607987642288208,
'WeightedEnsemble_L4': 0.0006947517395019531},
'num_bag_folds': 5,
'max_stack_level': 4,
'model_hyperparams': {'LightGBM_BAG_L1/T1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'LightGBM_BAG_L1/T2': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'LightGBM_BAG_L1/T3': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'LightGBM_BAG_L1/T4': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'LightGBM_BAG_L1/T5': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'LightGBM_BAG_L1/T6': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'LightGBM_BAG_L1/T7': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'LightGBM_BAG_L1/T8': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'LightGBM_BAG_L1/T9': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'LightGBM_BAG_L1/T10': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'NeuralNetTorch_BAG_L1/T1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'NeuralNetTorch_BAG_L1/T2': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'NeuralNetTorch_BAG_L1/T3': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'WeightedEnsemble_L2': {'use_orig_features': False,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'LightGBM_BAG_L2/T1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'LightGBM_BAG_L2/T2': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'LightGBM_BAG_L2/T3': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'LightGBM_BAG_L2/T4': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'LightGBM_BAG_L2/T5': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'LightGBM_BAG_L2/T6': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'LightGBM_BAG_L2/T7': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'LightGBM_BAG_L2/T8': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'LightGBM_BAG_L2/T9': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'LightGBM_BAG_L2/T10': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'NeuralNetTorch_BAG_L2/T1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'NeuralNetTorch_BAG_L2/T2': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'NeuralNetTorch_BAG_L2/T3': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'WeightedEnsemble_L3': {'use_orig_features': False,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'LightGBM_BAG_L3/T1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'LightGBM_BAG_L3/T2': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'LightGBM_BAG_L3/T3': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'LightGBM_BAG_L3/T4': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'LightGBM_BAG_L3/T5': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'LightGBM_BAG_L3/T6': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'LightGBM_BAG_L3/T7': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'LightGBM_BAG_L3/T8': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'LightGBM_BAG_L3/T9': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'LightGBM_BAG_L3/T10': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'NeuralNetTorch_BAG_L3/T1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'NeuralNetTorch_BAG_L3/T2': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'NeuralNetTorch_BAG_L3/T3': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'NeuralNetTorch_BAG_L3/T4': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'WeightedEnsemble_L4': {'use_orig_features': False,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True}},
'leaderboard': model score_val pred_time_val fit_time \
0 WeightedEnsemble_L3 -34.813010 3.672309 513.167733
1 WeightedEnsemble_L4 -35.111313 6.914216 982.520030
2 LightGBM_BAG_L2/T8 -35.370413 2.576208 333.106889
3 LightGBM_BAG_L2/T3 -35.379546 2.616498 332.562367
4 LightGBM_BAG_L2/T1 -35.399568 2.610327 332.805831
5 LightGBM_BAG_L3/T2 -35.436314 5.517697 703.559947
6 LightGBM_BAG_L2/T10 -35.479069 2.546378 332.541213
7 LightGBM_BAG_L3/T7 -35.482654 5.498135 702.956028
8 LightGBM_BAG_L2/T7 -35.483605 2.573981 331.529147
9 WeightedEnsemble_L2 -35.548268 0.510324 102.946178
10 LightGBM_BAG_L3/T1 -35.687376 5.546787 703.716944
11 LightGBM_BAG_L2/T5 -35.701899 2.595537 331.862314
12 LightGBM_BAG_L2/T2 -35.738677 2.591325 333.131061
13 LightGBM_BAG_L3/T10 -35.802838 5.501674 703.391163
14 LightGBM_BAG_L1/T10 -35.888562 0.125862 8.176670
15 LightGBM_BAG_L3/T8 -35.922184 5.494029 704.176745
16 LightGBM_BAG_L3/T3 -35.926907 5.540156 703.672698
17 LightGBM_BAG_L3/T5 -35.966745 5.527014 703.825497
18 NeuralNetTorch_BAG_L2/T1 -36.089283 2.905886 376.906223
19 NeuralNetTorch_BAG_L3/T1 -36.248508 5.863663 745.350188
20 NeuralNetTorch_BAG_L2/T2 -36.669852 2.803221 411.570647
21 LightGBM_BAG_L1/T8 -37.071398 0.120370 8.737298
22 NeuralNetTorch_BAG_L3/T4 -37.120968 6.037293 827.288448
23 NeuralNetTorch_BAG_L3/T2 -37.619868 5.713954 781.361614
24 LightGBM_BAG_L1/T7 -37.815541 0.113737 8.417716
25 LightGBM_BAG_L1/T2 -38.660252 0.108927 7.505810
26 LightGBM_BAG_L1/T3 -39.344382 0.131958 7.882296
27 LightGBM_BAG_L1/T1 -39.956017 0.107116 7.668992
28 LightGBM_BAG_L1/T5 -42.962833 0.105097 8.176471
29 NeuralNetTorch_BAG_L1/T2 -55.209600 0.263336 85.370211
30 NeuralNetTorch_BAG_L1/T1 -63.861354 0.298772 49.849620
31 LightGBM_BAG_L3/T9 -89.027523 5.516489 703.865573
32 LightGBM_BAG_L2/T9 -89.278716 2.600882 332.053745
33 LightGBM_BAG_L3/T6 -98.968568 5.549705 703.914863
34 LightGBM_BAG_L2/T6 -99.200054 2.593319 332.261962
35 LightGBM_BAG_L3/T4 -102.006964 5.519915 702.987590
36 LightGBM_BAG_L2/T4 -102.237873 2.589868 332.614861
37 LightGBM_BAG_L1/T9 -105.416537 0.085729 8.484714
38 LightGBM_BAG_L1/T6 -108.880311 0.087695 7.851140
39 LightGBM_BAG_L1/T4 -117.026282 0.094454 8.223054
40 NeuralNetTorch_BAG_L2/T3 -328.420101 3.748296 458.769137
41 NeuralNetTorch_BAG_L1/T3 -405.762688 0.850481 106.742689
42 NeuralNetTorch_BAG_L3/T3 -417.225687 6.426141 799.380106
pred_time_val_marginal fit_time_marginal stack_level can_infer \
0 0.001020 0.665498 3 True
1 0.000695 0.704736 4 True
2 0.082673 10.020209 2 True
3 0.122963 9.475686 2 True
4 0.116792 9.719151 2 True
5 0.088392 8.884714 3 True
6 0.052843 9.454533 2 True
7 0.068830 8.280795 3 True
8 0.080446 8.442467 2 True
9 0.000756 0.661998 2 True
10 0.117481 9.041712 3 True
11 0.102002 8.775634 2 True
12 0.097790 10.044380 2 True
13 0.072368 8.715930 3 True
14 0.125862 8.176670 1 True
15 0.064723 9.501513 3 True
16 0.110851 8.997465 3 True
17 0.097709 9.150265 3 True
18 0.412351 53.819543 2 True
19 0.434358 50.674956 3 True
20 0.309685 88.483967 2 True
21 0.120370 8.737298 1 True
22 0.607988 132.613215 3 True
23 0.284649 86.686382 3 True
24 0.113737 8.417716 1 True
25 0.108927 7.505810 1 True
26 0.131958 7.882296 1 True
27 0.107116 7.668992 1 True
28 0.105097 8.176471 1 True
29 0.263336 85.370211 1 True
30 0.298772 49.849620 1 True
31 0.087184 9.190340 3 True
32 0.107347 8.967065 2 True
33 0.120400 9.239630 3 True
34 0.099784 9.175281 2 True
35 0.090610 8.312357 3 True
36 0.096333 9.528181 2 True
37 0.085729 8.484714 1 True
38 0.087695 7.851140 1 True
39 0.094454 8.223054 1 True
40 1.254761 135.682457 2 True
41 0.850481 106.742689 1 True
42 0.996836 104.704874 3 True
fit_order
0 28
1 43
2 22
3 17
4 15
5 30
6 24
7 35
8 21
9 14
10 29
11 19
12 16
13 38
14 10
15 36
16 31
17 33
18 25
19 39
20 26
21 8
22 42
23 40
24 7
25 2
26 3
27 1
28 5
29 12
30 11
31 37
32 23
33 34
34 20
35 32
36 18
37 9
38 6
39 4
40 27
41 13
42 41 }
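As the output above shows, `fit_summary()` returns a plain dict alongside the printed report. As a small illustrative sketch (not tied to a live predictor — the values are copied from the summary above), this is how the best model and its score could be pulled out programmatically:

```python
# Mock of a (truncated) fit_summary() return value, using values from the
# output above. With a real predictor you would use:
#   summary = predictor_new_hpo.fit_summary()
summary = {
    "model_best": "WeightedEnsemble_L3",
    "model_performance": {
        "WeightedEnsemble_L2": -35.548267987794915,
        "WeightedEnsemble_L3": -34.81300990449738,
        "WeightedEnsemble_L4": -35.111313282432405,
    },
}

best_model = summary["model_best"]
best_score = summary["model_performance"][best_model]
print(best_model, best_score)  # WeightedEnsemble_L3 -34.81300990449738
```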
predictions_new_hpo = predictor_new_hpo.predict(test)
predictions_new_hpo.head()
0    24.718138
1    41.588139
2    46.161873
3    49.389935
4    52.169258
Name: count, dtype: float32
# How many negative values do we have?
len(predictions_new_hpo[predictions_new_hpo < 0])
0
# Set them to zero (just here in case I rerun and we have negative values)
predictions_new_hpo[predictions_new_hpo < 0] = 0
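The boolean-indexing assignment above can also be written with pandas' `Series.clip`, which zeroes negatives in a single call. A minimal sketch with toy values (not the real predictions):

```python
import pandas as pd

# Equivalent to preds[preds < 0] = 0, but without mutating in place:
# clip(lower=0) replaces any value below 0 with 0.
preds = pd.Series([24.7, -3.1, 46.2, -0.5])  # toy values for illustration
clipped = preds.clip(lower=0)
print(clipped.tolist())  # [24.7, 0.0, 46.2, 0.0]
```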
# Submit the predictions, same as before
submission_new_hpo = pd.read_csv('sampleSubmission.csv', parse_dates=['datetime'])
submission_new_hpo["count"] = predictions_new_hpo
submission_new_hpo.to_csv("submission_new_hpo.csv", index=False)
!kaggle competitions submit -c bike-sharing-demand -f submission_new_hpo.csv -m "new features with hyperparameters"
100%|█████████████████████████████████████████| 188k/188k [00:00<00:00, 482kB/s] Successfully submitted to Bike Sharing Demand
!kaggle competitions submissions -c bike-sharing-demand | tail -n +1 | head -n 6
fileName                     date                 description                        status    publicScore  privateScore
---------------------------  -------------------  ---------------------------------  --------  -----------  ------------
submission_new_hpo.csv       2022-03-17 19:53:54  new features with hyperparameters  complete  0.45461      0.45461
submission_new_features.csv  2022-03-17 19:06:19  new features                       complete  0.67065      0.67065
submission.csv               2022-03-17 16:28:48  first raw submission               complete  1.80373      1.80373
submission.csv               2022-03-14 16:28:05  first raw submission               complete  1.80918      1.80918
The Kaggle score improved again: from 0.67065 with the new features alone to 0.45461 after hyperparameter tuning.
# Taking the top model score from each training run and creating a line plot to show improvement
# You can create these in the notebook and save them to PNG or use some other tool (e.g. google sheets, excel)
fig = pd.DataFrame(
{
"model": ["initial", "add_features", "hpo"],
"score": [?, ?, ?]
}
).plot(x="model", y="score", figsize=(8, 6)).get_figure()
fig.savefig('model_train_score.png')
import matplotlib.pyplot as plt
# Best-model training score (RMSE) from each run
fig = pd.DataFrame(
{
"train_eval": ["initial", "add_features", "hpo"],
"score": [52.9, 30.2, 34.8]
}
).plot(x="train_eval", y="score", figsize=(8, 6)).get_figure()
plt.ylabel("RMSE")
fig.savefig('./img/model_train_score.png')
# Take the 3 Kaggle scores and create a line plot to show improvement
fig = pd.DataFrame(
{
"test_eval": ["initial", "add_features", "hpo"],
"score": [1.80, 0.67, 0.45]
}
).plot(x="test_eval", y="score", figsize=(8, 6)).get_figure()
plt.ylabel('RMSLE')
fig.savefig('./img/model_test_score.png')
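To quantify the gains shown in the plots, the relative improvement between the Kaggle RMSLE scores recorded above can be computed directly. A pure-Python sketch using the rounded scores (1.80, 0.67, 0.45) from the submissions:

```python
# Kaggle RMSLE for each run (lower is better), from the submissions above
scores = {"initial": 1.80, "add_features": 0.67, "hpo": 0.45}

def relative_improvement(old, new):
    """Fractional reduction in error when moving from `old` to `new`."""
    return (old - new) / old

feat_gain = relative_improvement(scores["initial"], scores["add_features"])
hpo_gain = relative_improvement(scores["add_features"], scores["hpo"])

print(f"feature engineering cut RMSLE by {feat_gain:.0%}")        # ~63%
print(f"hyperparameter tuning cut it by a further {hpo_gain:.0%}")  # ~33%
```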
# The hyperparameter settings used in each run, with the Kaggle score as the result
pd.DataFrame({
"model": ["initial", "add_features", "hpo"],
"hpo1": ['Default settings', 'Default settings', 'NN OPTIONS: {num_epochs: 30, learning_rate: [1e-5, 1e-1], activation: [relu, softrelu, sigmoid, tanh], dropout_prob: [0.0, 0.5]'],
"hpo2": ['Default settings', 'Default settings', 'GBM options: {num_boost_round: 100, num_leaves: [26,66]}'],
"hpo3": ['Default settings', 'Default settings', 'Default settings'],
"score": [1.80, 0.67, 0.45]
})
| | model | hpo1 | hpo2 | hpo3 | score |
|---|---|---|---|---|---|
| 0 | initial | Default settings | Default settings | Default settings | 1.80 |
| 1 | add_features | Default settings | Default settings | Default settings | 0.67 |
| 2 | hpo | NN options: {num_epochs: 30, learning_rate: [1e-5, 1e-1], activation: [relu, softrelu, sigmoid, tanh], dropout_prob: [0.0, 0.5]} | GBM options: {num_boost_round: 100, num_leaves: [26, 66]} | Default settings | 0.45 |
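The search spaces in the table above would be passed to AutoGluon roughly as follows. This is a configuration sketch following the AutoGluon tabular tutorial pattern, assuming the AutoGluon version installed at the top of the notebook; default values inside the spaces are assumptions, not taken from the actual run:

```python
import autogluon.core as ag

# Neural-network search space matching hpo1 in the table above
nn_options = {
    "num_epochs": 30,
    "learning_rate": ag.space.Real(1e-5, 1e-1, default=5e-4, log=True),
    "activation": ag.space.Categorical("relu", "softrelu", "sigmoid", "tanh"),
    "dropout_prob": ag.space.Real(0.0, 0.5, default=0.1),
}

# LightGBM search space matching hpo2
gbm_options = {
    "num_boost_round": 100,
    "num_leaves": ag.space.Int(lower=26, upper=66, default=36),
}

# All other model types keep their default settings (hpo3)
hyperparameters = {"GBM": gbm_options, "NN": nn_options}
```

This `hyperparameters` dict is then passed to `TabularPredictor.fit` together with a `hyperparameter_tune_kwargs` argument to enable the search.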